away with the notion that the generalized inverse of a
matrix is a powerful tool in optimization problems, a tool
$A$                      the m x n matrix

$A^{-1}$                 the unique inverse matrix of matrix A

$A^{+}$                  the unique Moore-Penrose generalized inverse
                         of A

$A^{T}$                  the transpose matrix of A

[...]                    the matrix A whose elements come from the
                         j-dimensional field of variable elements, $z_{j}$

$A_{i}$                  matrix inverse of A that satisfies Penrose
                         equation (i). The subscript notation is used
                         to identify which of the Penrose equations
                         are satisfied by the particular inverse

[...]                    alternative notation for the generalized
                         inverse

$A(X,Y)$                 functional notation, where A is a function of
                         X and Y

$I_{r}$                  the r x r identity matrix

$\mathbb{R}$             the real number field

$\mathbb{R}^{m \times n}$  the real field of m x n matrices

$\mathbb{C}$             the complex number field

$\mathbb{C}^{m \times n}$  the complex field of m x n matrices

$R(A)$                   range, or column space, of the matrix A

$N(A)$                   null space of matrix A

$\nabla F$               gradient function, gradient of function F

$\underline{x}$          the vector x; the underscore character
                         signifies a vector

$\hat{b}$                estimator of vector b; the ^ symbol stands
                         for estimator

$\det(\,)$               determinant function

min                      minimize, as in finding the minimum of some
                         functional value

max                      maximize, as in finding the largest of some
                         functional value

opt                      optimize, general term meaning either
                         minimize or maximize, whichever is proper

s.t.                     subject to or such that, normally used in
                         conjunction with optimization problem
                         constraints that must be satisfied

$\subset$                subset of

$\in$                    an element of

$\leq$                   less than or equal to

$\geq$                   greater than or equal to

$\neq$                   not equal to

$\sum_{i=1}^{m}$         sum of the m elements indexed by i and
                         sequenced from 1 to m, inclusive

$\equiv$                 used to mean is equivalent to

$\ni$                    contains elements of a set

$\perp$                  as a superscript denotes orthogonal
                         complement

[...]                    used to denote multiplication between two
                         matrices
This thesis examines the applications of the generalized
inverse of a matrix. In particular, use is made of the
generalized inverse of a matrix containing variable elements.
Such matrices are referred to as multiparameter, polynomial,
or variable element matrices. The notion of a generalized
inverse in fact "generalizes" the concept of a matrix inverse.
A matrix inverse exists only for square, non-singular
matrices. The generalized inverse extends this notion to
non-square, singular matrices. The classical matrix inverse,
when it exists, is a unique element of the set of generalized
inverses for the matrix.

Many modern problems involve multiparameter matrices. The
ability to obtain inverses for such matrices, both singular
and non-singular, is a necessity in solving these problems.

This thesis consolidates the theory of generalized inverses,
including extensions to multiparameter matrices.

An in-depth discussion is made of the ST method for computing
all generalized inverses of a matrix as well as the strong
interface between the ST method and the Fundamental Theorem of
Linear Algebra. Finally, selected application problems are
solved demonstrating the utility of the generalized inverse in
such problems.
NONLINEAR OPTIMIZATION INVOLVING

POLYNOMIAL MATRICES AND THEIR
GENERALIZED INVERSES


I. Introduction


Background 

This  thesis  primarily  involves  the  solving  of  highly 
nonlinear  optimization  problems.  The  approach  taken  is  to 
focus  on  using  generalized  inverses  of  polynomial  matrices 
as  a  tool  for  solving  these  problems.  Since  matrices  are 
often  used  to  describe  the  nonlinear  problem  as  well  as  to 
determine  optimal  solutions,  they  play  a  vital  role  in 
optimization  theory.  The  generalized  inverses  of  these 
matrices,  whether  the  matrices  have  polynomial  or  constant 
elements,  can  provide  a  powerful  solution  technique  in  many 
cases. 

Optimization theory involves the problem of finding the
extremum point(s) of some objective function. This function
may or may not be subject to a set of constraining functions.
These constraining functions limit the feasible region of
potential solutions for the objective function.

An optimization study may be concerned with many different
types of problems. The study may be concerned with the system
cost for a new space vehicle launch system, or the problem of
modeling the heat dispersion of air flow over a wing for some
fixed-wing aircraft. The study can of course address other
problems in areas such as control theory, reliability, or
functional analysis. The point is that there are many examples
covering all areas of current research.

In conducting an optimization study, the analyst must draw
upon knowledge from numerous fields. The mathematical
disciplines often used are matrix and vector theory, calculus
and differential equations, and possibly some abstract
mathematical theory and finite element methods. And of course,
in today's complex environment, a thorough knowledge of
computers and of the computer algorithms employed to
numerically find the "best candidate" solution is a valuable
asset.

A working definition of optimization theory can be stated as,
"optimization theory is a body of mathematical results and
numerical methods for finding and identifying the best
candidate from a collection of alternatives without having to
explicitly enumerate and evaluate all possible alternatives"
(38:1). This working definition fits nicely into the
previously described framework for the optimization study.

A quick look into any text on optimization theory reinforces
the statements just made. In addition to the numerous fields
of knowledge and research involved, the optimization text
provides an appreciation of the many aspects of optimization
that must be considered, not only in formulating the problem,
but also in solving the problem.

As initially stated, the optimization problem may be
constrained or unconstrained. In unconstrained optimization,
the function can take on any defined numerical value. In such
cases, iterative techniques are quite efficient. Some popular
techniques include Golden Section and Fibonacci algorithms
(49:121) for single-dimensional problems. For
higher-dimensional problems, calculus techniques involving the
gradient function are the usual choice.
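
As an illustration of the single-dimensional case, the following is a minimal sketch of a golden section search. It is not taken from the cited text; the function name, the tolerance, and the test function are assumptions chosen for the example, and the objective is assumed to be unimodal on the search interval.

```python
import math

def golden_section_minimize(f, lo, hi, tol=1e-8):
    """Minimize a unimodal function f on [lo, hi] by golden section search."""
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0   # 1/phi, about 0.618
    x1 = hi - inv_phi * (hi - lo)            # left interior probe
    x2 = lo + inv_phi * (hi - lo)            # right interior probe
    f1, f2 = f(x1), f(x2)
    while hi - lo > tol:
        if f1 < f2:
            # Minimum lies in [lo, x2]; the old x1 becomes the new right probe.
            hi, x2, f2 = x2, x1, f1
            x1 = hi - inv_phi * (hi - lo)
            f1 = f(x1)
        else:
            # Minimum lies in [x1, hi]; the old x2 becomes the new left probe.
            lo, x1, f1 = x1, x2, f2
            x2 = lo + inv_phi * (hi - lo)
            f2 = f(x2)
    return 0.5 * (lo + hi)

# Hypothetical test function with its minimum at x = 2.
x_star = golden_section_minimize(lambda x: (x - 2.0) ** 2, 0.0, 5.0)
```

Each pass reuses one of the two previous function evaluations, so only one new evaluation is needed per iteration.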

Constrained optimization must search for the best candidate
in some predetermined region of the number field. To be a
valid candidate for the optimal solution, a candidate must
satisfy a set of constraining functions. The two dominant
techniques in the linear function arena (i.e., linear
programming) are the simplex algorithm, first developed by
Dantzig (8:14), and Karmarkar's algorithm (14:75). For
nonlinear functions, there are gradient search techniques,
penalty function techniques, and iterative linear
approximation techniques, to name a few (13:vi). In addition,
there are the classical Lagrange multiplier method and the
Kuhn-Tucker optimality conditions, which are the basis for
most other techniques in addition to being solution techniques
in themselves (38:184-200).

Regardless of the technique, the theory of matrices is a key
player. Once again, nearly all optimization texts have an
appendix or chapter dedicated solely to matrix theory. Matrix
theory is vital to understanding problem formulation and then
understanding the solution techniques. It is this matrix
theory that is the driving force of this thesis. In
particular, this thesis explores the extensions of that
classical matrix theory to include the generalized inverses of
all matrices.

The first attraction of matrices is the very compact,
easy-to-follow problem formulations they provide. Systems of
equations are very neatly summarized using matrices. The
following example demonstrates this.

Consider the following system, which is an example from
Chapter IV. Disregard for the moment that the matrix in (1.3)
below contains variable elements (i.e., it is a polynomial or
multiparameter matrix):

\[
\begin{aligned}
400X^{3} - 400XY + 2X &= 2 \\
-200X^{2} + 200Y &= 0
\end{aligned}
\tag{1.1}
\]

This may be written in a much more compact notation as:

\[
A(X,Y) \begin{bmatrix} X \\ Y \end{bmatrix} = B(X,Y) \tag{1.2}
\]

where

\[
A(X,Y) = \begin{bmatrix} 400X^{2} + 2 & -400X \\ -200X & 200 \end{bmatrix} \tag{1.3}
\]

and

\[
B(X,Y) = \begin{bmatrix} 2 \\ 0 \end{bmatrix} \tag{1.4}
\]

Equation (1.2) is easier to understand than equation (1.1)
since it is uncluttered and more compact. The problem solution
can then be derived using the same matrix notation. For
example, a problem of the form (1.2) is easily solved using
the notion of a matrix inverse. Should this inverse, $A^{-1}$,
exist, the unique solution is given by:

\[
\begin{bmatrix} X \\ Y \end{bmatrix} = A^{-1}(X,Y)\, B(X,Y) \tag{1.5}
\]

However, not all matrices have this "classical" matrix
inverse. In these cases, the notion of a generalized inverse
is used to solve the optimization problem. More details will
come later, but the generalized inverse does in fact
generalize the idea of a matrix inverse since the $A^{-1}$
inverse matrix, when it exists, is identical to the
Moore-Penrose generalized inverse matrix (46:138).
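
The matrix form of the example system can be checked numerically at a single point. The sketch below is illustrative only: it assumes NumPy, evaluates the variable-element matrix at the trial point X = 1, Y = 1 (a choice made for this example), and uses `numpy.linalg.pinv` as a stand-in for the Moore-Penrose generalized inverse. Because the matrix itself depends on X, the inverse formula only reproduces a point that already satisfies the system.

```python
import numpy as np

def A(X, Y):
    # Coefficient matrix of the example system; Y does not appear in the entries.
    return np.array([[400.0 * X**2 + 2.0, -400.0 * X],
                     [-200.0 * X,          200.0]])

B = np.array([2.0, 0.0])      # right-hand side of the example system

X, Y = 1.0, 1.0               # trial point, chosen for illustration
A_num = A(X, Y)

# The matrix form holds at the trial point when this residual is zero.
residual = A_num @ np.array([X, Y]) - B

# Classical inverse solution at the trial point.
solution = np.linalg.inv(A_num) @ B

# For a nonsingular matrix the Moore-Penrose inverse equals the classical inverse.
same_inverse = np.allclose(np.linalg.inv(A_num), np.linalg.pinv(A_num))
```

At this point the matrix is nonsingular (its determinant is 400), so both inverses agree and the solution returns the trial point itself.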

The history of the generalized inverse is short. Some initial
work in the 1920's and 1950's was followed by a flurry of
activity in the late 1960's through the early 1980's. However,
this work dealt primarily with constant coefficient matrices.
For most problems in areas such as statistics, control theory,
and even optimization, this limited applicability was
sufficient. The next chapter surveys a cross section of these
application areas. But as shown in equation (1.3), matrices
can contain variable coefficients. This then is an example
from the newest, most rapidly expanding area of research
involving generalized inverses, that of multiparameter
matrices, or matrices that contain variable elements.

Problem

Much of the work done involving generalized inverses is in
the applications of inverses of matrices with constant
coefficients. Only recently has interest turned towards
working with matrices having variable elements. This interest
is due to the ever increasing complexity of modern systems as
well as the ability of modern supercomputer systems to
manipulate and evaluate variable element equations and
matrices. Such expert systems as MACSYMA (50) enable the user
to manipulate purely symbolic equations or matrices. The
ability of a system to handle variable elements is necessary
in modern systems theory involving multiparameters (20:25) and
large scale network problems (4S:514), to name two examples.

Thus, work must be done to enable users to solve such
multiparameter systems. This thesis brings together the theory
set forth to date and demonstrates the use of the generalized
inverse of a multiparameter matrix as a tool applied to
selected applications such as control theory and nonlinear
optimization. The ability to use this technique gives the
analyst an often powerful method for solving complex problems.

Research  Objective 

This thesis provides a concise, yet thorough, compilation of
generalized inverse theory. The theory is then used to solve
practical examples from fields such as optimization and
control theory. The theory and examples provide the reader
with an appreciation of how valuable a generalized inverse can
be in solving a wide range of problems.

Approach  and  Presentation 

Chapter II reviews a cross-section of the generalized inverse
field. The purpose of this review is to emphasize the range of
applications and the way in which the generalized inverse does
in fact provide a very general solution format. It begins with
a brief history of the theory, followed by discussions of
various applications. First among the applications is
regression analysis, followed by nonlinear optimization
techniques. Each discussion explains how the technique is
implemented and how the generalized inverse plays a crucial
role. The final section of this chapter looks at some of the
various computational techniques available. Again, the goal is
to explain the techniques, not just enumerate them.

Chapter III presents a consolidation of the theory and
knowledge at the basis of this thesis work. The intent is to
bring together in one concise chapter the relevant theorems
presented to date, supplemented by discussions of the
theorems. As a result, this chapter also highlights the trend
in the theory regarding multiparameter matrices. For the most
part, the theorems are presented without proofs but with
references to where the proofs can be found. Each theorem is
discussed to enhance reader understanding of the generalized
inverse theory being presented.
Chapter IV applies the theory from Chapter III in specific
application areas. The first topic is the detailed steps to
follow in computing the generalized inverse using the ST
computational technique. This follows directly from the theory
laid out in the previous chapter. The first three applications
presented involve unconstrained optimization, implicit
function theory, and constrained optimization, respectively.
The chapter concludes with a robust control theory problem and
an example that employs the theory regarding common solutions
to sets of matrix equations.

Finally, in Chapter V, the thesis is summarized and the
important points are reiterated. The chapter closes with
recommendations for future areas of research regarding
multiparameter matrices.
II. LITERATURE REVIEW


Introduction

The purpose of this chapter is to examine the current
knowledge in the area of generalized inverses of matrices.
This concise summary focuses on the theory and applications of
generalized inverses, and is divided into three main sections.

The first section briefly discusses the history of the
generalized inverse. Some specific application areas follow in
section two. The areas discussed represent a small
cross-section of the application areas. The emphasis is on the
diversity of the field, while providing an understanding of
just how the particular technique under examination applies,
and exploits, the generalized inverse matrix. The final
section addresses techniques developed to compute the various
classes of generalized inverses.

History. 

Given any square matrix, A, if the determinant of A is
non-zero (i.e., $\det(A) \neq 0$) then there exists a matrix,
$A^{-1}$, that satisfies the property

\[
A^{-1} A = A A^{-1} = I \tag{2.1}
\]

where I is the identity matrix. The matrix $A^{-1}$ is called
the inverse of A and A is said to be invertible or
nonsingular. However, if A is non-square, or is square with a
zero determinant (i.e., $\det(A) = 0$), then there is no
matrix B such that

\[
A B = B A = I \tag{2.2}
\]

and A is said to be a singular matrix. In cases where the
matrix A is singular, inverses from a larger class of matrix
inverses must be computed. This is the class of generalized
inverses, of which $A^{-1}$ is a unique element when it does
in fact exist.

In his 1985 AFIT thesis, Murray (27:1-3) classified the
history of generalized inverses by identifying five key
developments. The first occurred in 1903 when Fredholm
introduced the concept of the generalized inverse, calling the
inverse matrix a pseudoinverse. In 1920, Moore proved
algebraically the concept of a unique generalized inverse for
every finite matrix. He called his matrix a general reciprocal
matrix. It wasn't until 1951, in work done by Bjerhammar, that
the relationship of this generalized inverse was extended to a
system of linear equations.

Using Bjerhammar's results, yet apparently unaware of Moore's
earlier work, Penrose showed that this generalization of the
"classical" matrix inverse was unique for every matrix.
Penrose defined four conditions that this inverse must meet.
These Penrose conditions (32:406) are used as a basis for
classifying all the generalized inverses of a matrix. The
conditions, along with Penrose's original theorem, are
presented and discussed in depth in Chapter III. For now, the
conditions are presented without proof for discussion
purposes. The four conditions that Penrose identified are:

\[
\begin{aligned}
A A^{+} A &= A && (1) \\
A^{+} A A^{+} &= A^{+} && (2) \\
(A A^{+})^{*} &= A A^{+} && (3) \\
(A^{+} A)^{*} &= A^{+} A && (4)
\end{aligned}
\tag{2.3}
\]

where $*$ denotes the conjugate transpose of the matrix.

The matrix that satisfies these four equations, denoted as
$A^{+}$, is referred to as the Moore-Penrose generalized
inverse in recognition of contributions from both researchers.
Later work involved matrices that satisfy some, but not
necessarily all, of the Penrose equations. For example, a
matrix B that satisfies condition (1) may not be unique, but
is sufficient for use in solving sets of linear equations
(28:127). A matrix that satisfies conditions (1) and (2) is
often referred to as a weak-generalized inverse, WGI, or the
$A_{1,2}$ generalized inverse.
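
The four Penrose conditions are easy to test numerically. The sketch below is an illustration, not part of the cited work: it assumes NumPy, the helper name `penrose_conditions` is hypothetical, and the 2 x 2 singular matrix and the hand-picked inverse `G1` are made-up examples. It reports which conditions a candidate inverse satisfies.

```python
import numpy as np

def penrose_conditions(A, G, tol=1e-10):
    """Return which of the four Penrose conditions G satisfies for A."""
    H = lambda M: M.conj().T   # conjugate transpose
    return {
        1: np.allclose(A @ G @ A, A, atol=tol),
        2: np.allclose(G @ A @ G, G, atol=tol),
        3: np.allclose(H(A @ G), A @ G, atol=tol),
        4: np.allclose(H(G @ A), G @ A, atol=tol),
    }

A = np.array([[1.0, 2.0],
              [2.0, 4.0]])     # singular: the second row is twice the first

# NumPy's pseudoinverse is the Moore-Penrose inverse, so all four should hold.
mp = penrose_conditions(A, np.linalg.pinv(A))

# A hand-picked matrix that satisfies conditions (1) and (2) only.
G1 = np.array([[1.0, 0.0],
               [0.0, 0.0]])
weak = penrose_conditions(A, G1)
```

Here `G1` behaves as a weak generalized inverse: it satisfies the first two conditions, but neither `A @ G1` nor `G1 @ A` is symmetric.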

Murray identifies the fifth and final major development to be
the Jones ST method of computing all the generalized inverses
of a matrix (27:3). Since the ST method is the particular
technique used in this thesis, a more in-depth discussion of
the technique is provided in Chapter III.

Applications. 

In 1956 Penrose showed that the unique $A^{+}$ generalized
inverse, when used to solve an equation of the form:

\[
A x = b \tag{2.4}
\]

provided the least-squares, minimum-norm solution vector x.
However, there are other classes of generalized inverses for a
matrix that satisfy only a portion of the four Penrose
equations. Since only the $A^{+}$ is unique for any matrix A,
other inverses are members of subsets of inverses. For
instance, an $A_{1,2}$ matrix is not unique for a given matrix
A, but just one member of the subset of $A_{1,2}$ matrix
inverses. Some ideas from set theory (47:218) show this subset
relationship:

\[
A^{+} \subseteq (A_{1,2,3},\, A_{1,2,4}) \subseteq (A_{1,2}) \subseteq (A_{1},\, A_{2}) \tag{2.5}
\]

This set relationship is shown pictorially in Figure 1
(27:5).


After Penrose's original work, much was done in identifying
the properties of these subsets of inverses. As Penrose
proved, the $A^{+}$ matrix provides the least-squares,
minimum-norm solution to an equation of the form (2.4).
However, a lesser inverse, the $A_{1,3}$, generates the
least-squares estimator. Similarly, the $A_{1,4}$ inverse
generates the minimum-norm solution (28:129-132). As expected,
these generalized inverses come up quite often in statistical
applications, as the next section illustrates.
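
The least-squares, minimum-norm property can be seen on a small rank-deficient system. The following sketch assumes NumPy; the matrix and right-hand side are made-up values, and `numpy.linalg.lstsq` is used only as an independent check.

```python
import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, 1.0],
              [0.0, 0.0]])     # rank 1: neither square nor of full rank
b = np.array([1.0, 3.0, 1.0])  # no exact solution of A x = b exists

x_mp = np.linalg.pinv(A) @ b                  # Moore-Penrose solution
x_ls, *_ = np.linalg.lstsq(A, b, rcond=None)  # NumPy's least-squares solver

# Adding any z with A z = 0 keeps the residual but increases the norm.
z = np.array([1.0, -1.0])
x_other = x_mp + z

res_mp = np.linalg.norm(A @ x_mp - b)
res_other = np.linalg.norm(A @ x_other - b)
```

Both `x_mp` and `x_other` attain the smallest possible residual, but `x_mp` is the shorter of the two vectors.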

Statistical Applications. A regression analysis model is of
the form:

\[
y = X b + e \tag{2.6}
\]

where y is the vector of dependent variables, X is the matrix
of independent variables, and e represents a random error
component with zero mean and known covariance matrix, V. The
vector b of regression parameters describes a linear
relationship between the independent variables and the
dependent variables. In reality, the values of b are unknown
and must be estimated from the data. A key question is how to
select the best estimators of b.


Figure 1: Sets of Generalized Inverses


These estimators are generally denoted as $\hat{b}$ (or
$\hat{\beta}$ in some texts). Desirable properties for the
vector of estimators, $\hat{b}$, are minimum variance and that
it be an unbiased estimator of the true parameters, b. If it
satisfies these properties it is called a best linear
estimator (BLE). Nelson (30:1-10) examined the use of
generalized inverses in regression analysis, both in
unconstrained and constrained regression analysis. In both
cases, the BLE for the problem was found in terms of the
Moore-Penrose generalized inverse.
The first case discussed is the unconstrained case (30:2). In
this instance the BLE of (2.6) is given by:

\[
\hat{b} = (X^{T} V^{-1} X)^{+} X^{T} V^{-1} y \tag{2.7}
\]

and the covariance of $\hat{b}$ is defined as:

\[
\operatorname{cov}(\hat{b}) = (X^{T} V^{-1} X)^{+} \tag{2.8}
\]

Nelson considered equality constraints (30:2-7) on b of the
form:

\[
A b = t \qquad (\text{inequality: } A b \leq t) \tag{2.9}
\]

In this case, his best restricted linear estimator (BRLE) was
defined as:

\[
\begin{aligned}
\hat{b} &= A^{+} t + C^{+} X^{T} V^{-1} (y - X A^{+} t) \\
C &= (I - A^{+} A)\, X^{T} V^{-1} X\, (I - A^{+} A)
\end{aligned}
\tag{2.10}
\]

and

\[
\operatorname{cov}(\hat{b}) = C^{+} \tag{2.11}
\]
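
A minimal numerical sketch of the two estimators follows. It assumes NumPy and reads the formulas as the generalized least-squares estimator and its equality-restricted counterpart; the function names `ble` and `brle`, the data, and the restriction are hypothetical and chosen only for illustration. An explicit `rcond` is passed to the pseudoinverse of C because C is singular by construction.

```python
import numpy as np

def ble(X, V, y):
    """Unconstrained estimator (X^T V^-1 X)^+ X^T V^-1 y and its covariance."""
    Vinv = np.linalg.inv(V)
    cov = np.linalg.pinv(X.T @ Vinv @ X)
    return cov @ X.T @ Vinv @ y, cov

def brle(X, V, y, A, t):
    """Estimator restricted by A b = t, with covariance C^+."""
    Vinv = np.linalg.inv(V)
    A_plus = np.linalg.pinv(A)
    P = np.eye(X.shape[1]) - A_plus @ A                 # I - A^+ A
    C_plus = np.linalg.pinv(P @ X.T @ Vinv @ X @ P, rcond=1e-10)
    b0 = A_plus @ t                                     # particular solution of A b = t
    return b0 + C_plus @ X.T @ Vinv @ (y - X @ b0), C_plus

# Made-up data, for illustration only.
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([1.0, 2.0, 2.0, 4.0])
V = np.eye(4)
A = np.array([[1.0, 1.0]])    # restriction: intercept + slope = 2
t = np.array([2.0])

b_hat, cov_hat = ble(X, V, y)
b_res, cov_res = brle(X, V, y, A, t)
```

With V equal to the identity, the unconstrained estimate reduces to ordinary least squares, and the restricted estimate satisfies the restriction exactly.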

Nelson's methodology is more involved when the problem
involves inequality constraints (30:7-10) of the form in
(2.9). He first determines whether the unconstrained optimum
is feasible with respect to the constraints. If not, then he
uses the fact that the optimum must lie on a constraint
boundary. This implies that a certain subset of constraints
intersect to form the constraint boundary where the optimal
point resides. Since the point lies on the boundary, the
constraints must be satisfied as equality constraints.

Using this, Nelson essentially conducts a tree search of each
basis or subset of constraints satisfied as equalities (other
constraints must still be satisfied). In this manner, the
optimum solution is found in a finite number of steps.
However, this procedure is combinatorially inefficient for
problems involving larger constraint sets.

Stewart (45:634-662) did work somewhat similar to Nelson. His
work differed in that he was primarily interested in how
perturbations in A, due to uncertainty about A, affect the
generalized inverse $A^{+}$, the minimum-norm solution,
$A^{+}b$, and the least-squares problem.

Another area is the study of Markov chains, either discrete
or continuous, examined by Hunter in 1982. In his work, Hunter
(16) characterized all the generalized inverses of the matrix
(I - P). For discrete Markov chains, P is the one-step
transition matrix. In the continuous case, P is the
infinitesimal generator. Hunter found the necessary stationary
and first passage time distributions for the problem in terms
of these generalized inverses. In proving his results, Hunter
refuted previous work claiming generalized inverses were not
applicable to the study of Markov chains (10:190-107).
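
To make the connection concrete, the sketch below recovers a stationary distribution from a generalized inverse of (I - P). It is an illustration under stated assumptions rather than a reproduction of Hunter's derivation: it assumes NumPy, an irreducible two-state chain with made-up transition probabilities, and uses the Moore-Penrose inverse as one convenient generalized inverse. The only property it relies on is the first Penrose condition.

```python
import numpy as np

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])     # one-step transition matrix (illustrative)
n = P.shape[0]
M = np.eye(n) - P              # singular, since each row of P sums to one
G = np.linalg.pinv(M)          # one particular generalized inverse of I - P

# Because M G M = M, every row r of R satisfies r (I - P) = 0.
R = np.eye(n) - M @ G
v = np.ones(n) @ R             # a left null vector of I - P
pi = v / v.sum()               # normalize so the probabilities sum to one
```

For this chain the result is (5/6, 1/6), which is unchanged by one step of the chain.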

The generalized inverse arises in many other statistical
settings involving singular matrices. For example, Rao
(30:201-203), Albert (28:225), and Henk Don (12:225-240)
examined Maximum Likelihood Estimation involving singular
information matrices. Rao and Yanai (37) looked at the
Gauss-Markov model and in particular the models involving
generalized inverses of partitioned matrices. Hsuan (15)
examined some specific uses of the $A^{+}$ generalized inverse
matrix. Among the areas he examined were the least-squares
problem and the conditions under which the quadratic form of a
multivariate normal random variable will follow a chi-square
distribution (15:245-247). A classic text by Rao and Mitra
provides chapters dedicated to just statistical applications
of the generalized inverse matrix (36:136-168).

Nonlinear Optimization. As previously discussed in Chapter I,
matrices play a key role in optimization studies. This section
looks at some optimization techniques involving the
generalized inverse.

A quadratic programming problem is a particular type of
nonlinear programming problem (NLP) involving a quadratic,
convex, objective function, subject to a set of linear
constraints (49:14). Nelson (29:1-2) discusses a quadratic
programming problem of the form:

\[
\begin{aligned}
\max \quad & f(x) \\
\text{s.t.} \quad & g_{i}(x) \leq 0 \quad \text{for } i = 1, \ldots, I
\end{aligned}
\tag{2.12}
\]

For any feasible solution to the problem, only a portion of
the I linear constraints are binding. For any feasible
solution, x, a constraint can be binding or non-binding. If
the functional value at x lies on the boundary of the feasible
region defined by a constraint, then that constraint is a
binding constraint. If the functional value is not on the
feasible region boundary defined by the constraint, then the
constraint is considered a non-binding constraint. Nelson
examines possible combinations of constraints as if that
combination of constraints were binding (treated as strict
equality constraints) and determines the optimal solution
given the particular combination of constraints. If the
resulting solution satisfies all the constraints, the solution
is a potential optimal solution. Once all possible
combinations of binding constraint sets are examined, the best
potential solution obtained is the optimal solution
(29:19-30).
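
The enumeration idea can be sketched on a small problem. The code below is a simplified illustration and not Nelson's algorithm: it assumes NumPy, takes the objective to be the negative squared distance to a hypothetical point `a`, and uses the pseudoinverse to find the closest point on each candidate set of binding constraints. The function name and the data are made up for the example.

```python
import itertools
import numpy as np

def enumerate_binding_sets(a, G, h, tol=1e-9):
    """Maximize -0.5*||x - a||^2 subject to G x <= h by trying every binding set."""
    best_x, best_f = None, -np.inf
    m = G.shape[0]
    for k in range(m + 1):
        for S in itertools.combinations(range(m), k):
            idx = list(S)
            if idx:
                Gs, hs = G[idx], h[idx]
                # Closest point to a on the set {x : Gs x = hs}.
                x = a - np.linalg.pinv(Gs) @ (Gs @ a - hs)
                if not np.allclose(Gs @ x, hs, atol=tol):
                    continue   # the chosen constraints cannot all bind at once
            else:
                x = a.copy()   # empty binding set: unconstrained optimum
            if np.all(G @ x <= h + tol):
                f = -0.5 * float((x - a) @ (x - a))
                if f > best_f:
                    best_x, best_f = x, f
    return best_x, best_f

a = np.array([2.0, 2.0])
G = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
h = np.array([1.0, 3.0, 2.5])
x_opt, f_opt = enumerate_binding_sets(a, G, h)
```

With three constraints there are eight subsets to examine; the count doubles with every added constraint, which is the combinatorial growth the text refers to.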

Nelson uses the generalized inverse to handle the non-square,
singular matrices that result from partitioning the constraint
set. Thus, a more general technique is obtained than if the
classical $A^{-1}$ inverse were used. Since this is the same
technique Nelson employs for inequality constrained
least-squares problems, the technique suffers from the same
combinatorial inefficiencies as before. For large problems
involving many constraints, this technique would be very
cumbersome and impractical.

Shankland avoided the combinatorial complexity of Nelson's
algorithm in his quadratic programming technique. The
formulation he used was (40:S):

\[
\begin{aligned}
\max \quad & S = a^{T} x - \tfrac{1}{2}\, x^{T} B x \\
\text{s.t.} \quad & C^{T} x - d \leq 0
\end{aligned}
\tag{2.13}
\]

where B is a positive definite, symmetric matrix.

Shankland first decomposes B into the product of a lower
triangular matrix, L, and its transpose, $L^{T}$, and performs
the three following transformations on the formulation of
(2.13):

\[
\begin{aligned}
B &= L L^{T} \\
x' &= L^{T} x \\
a' &= L^{-1} a \quad \text{and} \quad C' = L^{-1} C
\end{aligned}
\tag{2.14}
\]

Shankland's final transformation shifts the origin to the
unconstrained maximum of (2.13), a point he calls a'. Using
the transformations:

\[
\begin{aligned}
x'' &= x' - a' \\
d'' &= d - C'^{T} a'
\end{aligned}
\tag{2.15}
\]

produces the final formulation (40:3):

\[
\begin{aligned}
\max \quad & S = \tfrac{1}{2} \left( a'^{T} a' - x''^{T} x'' \right) \\
\text{s.t.} \quad & C'^{T} x'' - d'' \leq 0
\end{aligned}
\tag{2.16}
\]

If the feasible region defined for (2.16) contains the origin
(i.e., the shifted origin), then the origin is the solution.
If not, then the point on the surface of the feasible region
closest to the new origin is the solution point for the
problem. The task is to find this point closest to the origin.
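
The chain of transformations can be verified numerically. The sketch below assumes NumPy and reads the formulation as a quadratic objective with a linear term in a vector `a` and constraint normals stored as the columns of `C`; that reading, and all numerical values, are assumptions made for this illustration. It checks that the objective and the constraint values are unchanged by the change of variables.

```python
import numpy as np

B = np.array([[4.0, 1.0],
              [1.0, 3.0]])     # symmetric positive definite (illustrative)
a = np.array([1.0, 2.0])
C = np.array([[1.0, 0.0],
              [1.0, 1.0]])     # columns are the assumed constraint normals
d = np.array([0.2, 0.3])

L = np.linalg.cholesky(B)      # lower triangular factor, B = L @ L.T
a1 = np.linalg.solve(L, a)     # a'  = L^{-1} a
C1 = np.linalg.solve(L, C)     # C'  = L^{-1} C
d2 = d - C1.T @ a1             # d'' = d - C'^T a'

def S_original(x):
    return a @ x - 0.5 * x @ B @ x

def S_shifted(x):
    x2 = L.T @ x - a1          # x'' = L^T x - a'
    return 0.5 * (a1 @ a1 - x2 @ x2)

x = np.array([0.3, -0.7])      # arbitrary test point
g_original = C.T @ x - d
g_shifted = C1.T @ (L.T @ x - a1) - d2

# The shifted origin x'' = 0 is feasible exactly when every entry of d'' is >= 0.
origin_feasible = bool(np.all(d2 >= 0))
```

In the shifted variables the objective depends only on the distance from the origin, which is why the constrained problem reduces to finding the feasible point closest to the origin.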

Although the origin, $x_{0} = 0$, is not a feasible point,
Shankland treats it as such for the moment. Using $x_{0}$, a
subset of constraints, V, is formed from the violated
constraints. This subset of constraints is solved as equality
constraints. If the feasible region defined by V is non-empty,
a Lagrange multiplier technique yields the solution point. If
the feasible region defined by V is empty, a generalized
inverse obtains a solution in terms of least-squares. The
least-squares function, $e^{T} e$, arises from:

\[
C^{T} x - d = e \neq 0 \tag{2.17}
\]

where e is not equal to zero since the region is inconsistent
in terms of the intersection of the constraints.

This least-squares solution might be improved using an
iterative refining technique (40:5-7). In this refining
process, violated constraints are retained, over-satisfied
constraints are removed, and a feasible solution is obtained
for the resulting constraint subset. If the solution violates
some other constraint(s), the process is repeated. If the
refining process finds no feasible solution, the initial
least-squares solution is retained as the best solution to the
problem. The primary feature of using a generalized inverse is
that the least-squares solution is obtained even when a unique
solution is not available due to inconsistency in the
constraint set. When this inconsistency occurs, Shankland
states that "the constraints are mutually incompatible"
(40:7).
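
A small example shows what the generalized inverse returns when the constraints treated as equalities are mutually incompatible. The sketch assumes NumPy; the two parallel constraints are made up for the illustration and are not taken from Shankland's paper.

```python
import numpy as np

# Two parallel constraints treated as equalities: x1 + x2 = 1 and x1 + x2 = 3.
Ct = np.array([[1.0, 1.0],
               [1.0, 1.0]])    # plays the role of the transposed constraint matrix
d = np.array([1.0, 3.0])

x = np.linalg.pinv(Ct) @ d     # least-squares (and minimum-norm) point
e = Ct @ x - d                 # residual; nonzero because the system is inconsistent
sum_sq = float(e @ e)          # value of the least-squares function
```

No point satisfies both equations, so the pseudoinverse returns the compromise point (1, 1), which splits the violation evenly between the two constraints.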

A penalty function technique is an optimization technique
that transforms a constrained optimization problem into an
unconstrained optimization problem. The basic concept is to
force convergence to the optimal solution by applying
increasing penalties for not satisfying the constraints
embedded in the unconstrained function. Thus, there is a
trade-off between satisfying the constraints and minimizing
the objective function (13:299-300).

Fletcher (28:223-224) uses the generalized inverse to
generate a penalty function from the original constrained
problem. Fletcher's technique employs the gradient of both
the objective function ($\nabla f(x) = F$) and the constraints
($\nabla g(x) = N$), as well as the Hessian matrix of the objective
function ($H(x)$). Then, to ensure proper behavior of the
function, a large positive definite matrix, Q, is added to
the function, giving an unconstrained function:

$$ \phi = F - NN^{+}F + g^{T}Qg \tag{2.18} $$

This unconstrained function can now be solved using any
appropriate unconstrained optimization technique. Fletcher
points out that this function:

(a) is suitable for a variety of problems,

(b) is well conditioned, and

(c) strongly interfaces to the classical Lagrange
method of multipliers, as well as other penalty
function techniques.

In a 1965 article, Charnes and Kirby used the
generalized inverse, in particular the inverse, to show
"that the modular design problem is simply a special case of
a large class of engineering design problems" presented
elsewhere in the literature (6:843). This special case
problem is the separable convex function subject to linear
equality constraints. A separable, convex function can be
approximated by a series of linear functions (i.e., linear
approximation). Such approximations enable the use of the
more efficient linear programming packages to solve the
problems. The algorithms specifically designed for the
modular design problem are often complex and inefficient, so
the increased efficiency gained from the approximations
offsets the effort required to reformulate the problem.

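The modular design problem presented by Charnes and
Kirby is of the following form (6:836):

$$
\begin{aligned}
\min\ & \Big(\sum_{i=1}^{m} e_{i}E_{i}\Big)\Big(\sum_{j=1}^{n} d_{j}D_{j}\Big) \\
\text{s.t.}\ & E_{i}D_{j} \ge R_{ij} \quad (i = 1, \ldots, m) \\
& E_{i},\ D_{j} > 0 \quad (j = 1, \ldots, n)
\end{aligned}
\tag{2.19}
$$

where $e_{i}$, $d_{j}$, and $R_{ij}$ are positive constants. Charnes and
Kirby state that the algorithm typically used
to solve these types of problems is very slow in converging
to the optimum (6:837).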

Charnes and Kirby use a series of transformations to
produce an equivalent formulation of (2.19) whose properties
of convexity and separability enable use of the linear
programming packages. The key transformation involves the
generalized inverse, which is coupled with the concept of
slack variables from linear programming. A key aspect of
linear programming involving linear inequality constraints
is that slack variables, $\omega$, enable the following
transformation:

$$ Ax \ge b \iff Ax = b + \omega, \quad \omega \ge 0 \tag{2.20} $$

From the generalized inverse theory, a consistency
condition for the equation $Ax = b$ to have a solution is
that $AA^{+}b = b$ (6:838). This concept can be combined with
(2.20) above to produce the following equivalence
relationship:
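$$
\begin{array}{lcl}
\text{opt } F(x) & \iff & \text{opt } F(x) \\
\text{s.t. } Ax \ge b & & \text{s.t. } Ax = b + \omega \\
 & & \phantom{\text{s.t. }} AA^{+}(b + \omega) = (b + \omega) \\
 & & \phantom{\text{s.t. }} \omega \ge 0
\end{array}
\tag{2.21}
$$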
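The consistency condition $AA^{+}b = b$ can be checked numerically. The following is a minimal sketch in Python with NumPy; the helper name `is_consistent`, the tolerances, and the sample data are illustrative assumptions and do not come from Charnes and Kirby.

```python
import numpy as np

def is_consistent(A, b, tol=1e-10):
    """True when A A^+ b reproduces b, i.e. b lies in the column space of A."""
    A_pinv = np.linalg.pinv(A, rcond=1e-10)  # Moore-Penrose inverse A^+
    return bool(np.allclose(A @ A_pinv @ b, b, atol=tol))

# Hypothetical data: both columns of A are identical, so Ax can only
# produce vectors whose three components are equal.
A = np.ones((3, 2))
b_consistent = np.array([2.0, 2.0, 2.0])
b_inconsistent = np.array([1.0, 2.0, 3.0])

ok = is_consistent(A, b_consistent)
bad = is_consistent(A, b_inconsistent)
print(ok, bad)
```

In the slack-variable setting, the same test applied to $b + \omega$ decides whether a candidate $\omega \ge 0$ leaves the equality system solvable.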

Working from the modular design formulation (2.19), the
following three transformations are applied to the initial
problem formulation:

(1) $y_{i} = e_{i}E_{i}$, $z_{j} = d_{j}D_{j}$, $r_{ij} = e_{i}d_{j}R_{ij}$

(2) $u_{i} = \ln(y_{i})$, $v_{j} = \ln(z_{j})$, $c_{ij} = \ln(r_{ij})$

after which the equivalence relation of (2.21) is used as a
third transformation.

These transformations change the problem formulation
according to the sequence in Figures 2.a through 2.d.
Although the Figure 2.d formulation may be solved for the $\omega$
values, the formulation is extended to the final form,
Figure 2.e, using the transformations:

$$ T = (I - AA^{+}) \tag{2.22} $$

$$ AA^{+}(b + \omega) = b + \omega \implies (I - AA^{+})\,\omega = -(I - AA^{+})\,b $$

The final form of the modular design problem, given in
Figure 2.e, is the desired separable convex function subject
to linear equality constraints. Through use of the
appropriate transformations, Charnes and Kirby demonstrate
the similarity between the modular design problem and linear
programming problems, making particular use of the
generalized inverse matrix.

(a) $\min \big(\sum_{i=1}^{m} e_{i}E_{i}\big)\big(\sum_{j=1}^{n} d_{j}D_{j}\big)$
    s.t. $E_{i}D_{j} \ge R_{ij}$
         $E_{i},\ D_{j} > 0$

(b) $\min \sum_{i=1}^{m} y_{i} \sum_{j=1}^{n} z_{j}$
    s.t. $y_{i}z_{j} \ge r_{ij}$

(c) $\min \sum_{i=1}^{m} e^{u_{i}} \sum_{j=1}^{n} e^{v_{j}} = \sum_{i=1}^{m}\sum_{j=1}^{n} e^{u_{i} + v_{j}}$
    s.t. $u_{i} + v_{j} \ge c_{ij}$

(d) $\min \sum_{i=1}^{m}\sum_{j=1}^{n} e^{(c_{ij} + \omega_{ij})}$
    s.t. $AA^{+}(c + \omega) = (c + \omega)$
         $\omega \ge 0$

(e) $\min \sum_{i=1}^{m}\sum_{j=1}^{n} e^{(c_{ij} + \omega_{ij})} = \sum_{i=1}^{m}\sum_{j=1}^{n} r_{ij}\,e^{\omega_{ij}}$
    s.t. $\omega_{11} - \omega_{1l} - \omega_{k+1,1} + \omega_{k+1,l} = -c_{11} + c_{1l} + c_{k+1,1} - c_{k+1,l}$
         $k = 1, \ldots, m-1$;  $l = 2, \ldots, n$
         $\omega_{ij} \ge 0$

Figure 2. Evolving Modular Design Formulation

Common Solutions. The last application area is not so
much an application as it is a demonstration of the trend
regarding applications of generalized inverses. This trend
is the move from matrices with constant coefficients to
matrices involving polynomial elements. These matrices with
polynomial elements are called multiparameter matrices. An
understanding of this trend demonstrates future applications
of, and research into, generalized matrix inverses.

In Penrose's original work, he presented necessary and
sufficient conditions for solutions to exist to a problem or
set of equations (32:409). In 1972, Mitra discussed the
simultaneous solution of two matrix equations (24), unaware
that a more generalized discussion was presented by Morris
and Odell in 1968 (26). In their earlier article, Morris and
Odell proved conditions for a common solution to n matrix
equations. Also in 1972, Shurbet (41) defined the necessary
and sufficient conditions for the consistency of a system of
linear matrix equations. This work built on the work of
Morris and Odell and advanced the theory for the constant
coefficient matrices.

With developments in multiparameter, multidimensional
systems, there arose the need to do similar work with the
generalized inverses of polynomial matrices. The first
attempt to study these multiparameter matrices was in 1978
by Bose and Mitra (4:49). Their article provided necessary
and sufficient conditions regarding the algebraic structure
of the multiparameter matrix. In an extension of the work of
Bose and Mitra, Sontag gave a complete characterization of
the weak generalized inverse ($A_{1,2}$) for matrices involving
several polynomial elements (42).

Later work by Jones in 1983 (18) and 1985 (20) extended
the theory, providing necessary and sufficient conditions
for the solution of several types of matrix equations.

Jones' work examined much of the work already done for
constant coefficient matrices and extended the theory to the
case of multiparameter matrices. Finally, to complete the
cycle of research, Jones extended the work of Morris and
Odell regarding common solutions to sets of matrix equations
to the area of multiparameter matrices (17). In his work,
Jones provided the necessary and sufficient conditions for
the existence of common solutions of n multiparameter matrix
equations.

Computation

The last topic in this section concerns methods for
computing the generalized inverse. As previously stated,
Murray cites Jones' ST method for computing all generalized
inverses as the last significant contribution to the theory.
The purpose of this section is to present some of the other
computation techniques. The ST method is presented in detail
in the next chapter.

Penrose's Method (30:208-209). The essence of this
technique relies on the ability to partition a matrix A,
having rank r, into the following form:

$$ A = \begin{pmatrix} A_{1} & A_{2} \\ A_{3} & A_{4} \end{pmatrix} \tag{2.23} $$

where $A_{1}$ is of dimensions $r \times r$, $A_{4} = A_{3}A_{1}^{-1}A_{2}$, and $A_{2}$ and $A_{3}$
are of suitable order such that the matrix A remains an $m \times n$
dimensional matrix. From this partition, define P as:

$$ P = (A_{1}A_{1}^{T} + A_{2}A_{2}^{T})^{-1} A_{1} (A_{1}^{T}A_{1} + A_{3}^{T}A_{3})^{-1} \tag{2.24} $$

and then

$$ A^{+} = \begin{pmatrix} A_{1}^{T}PA_{1}^{T} & A_{1}^{T}PA_{3}^{T} \\ A_{2}^{T}PA_{1}^{T} & A_{2}^{T}PA_{3}^{T} \end{pmatrix} \tag{2.25} $$

Though straightforward, this method relies upon
knowledge of the rank of the matrix A and the computation of
classical matrix inverses.
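The partition formulas (2.23) through (2.25) can be sketched in a few lines of Python with NumPy. This is an illustrative sketch only: the function name `penrose_partition_pinv` and the sample blocks are assumptions, and the code presumes that the rank r is known and that the leading $r \times r$ block is nonsingular.

```python
import numpy as np

def penrose_partition_pinv(A, r):
    """Moore-Penrose inverse from the block partition, assuming
    rank(A) == r and a nonsingular leading r x r block A1."""
    A1, A2 = A[:r, :r], A[:r, r:]
    A3 = A[r:, :r]
    # P as in (2.24)
    P = (np.linalg.inv(A1 @ A1.T + A2 @ A2.T)
         @ A1
         @ np.linalg.inv(A1.T @ A1 + A3.T @ A3))
    # Block assembly as in (2.25)
    top = np.hstack([A1.T @ P @ A1.T, A1.T @ P @ A3.T])
    bottom = np.hstack([A2.T @ P @ A1.T, A2.T @ P @ A3.T])
    return np.vstack([top, bottom])

# Hypothetical rank-2 example; A4 is forced to equal A3 A1^{-1} A2.
A1 = np.array([[2.0, 1.0], [1.0, 3.0]])
A2 = np.array([[1.0], [0.0]])
A3 = np.array([[1.0, 0.0], [0.0, 1.0]])
A4 = A3 @ np.linalg.inv(A1) @ A2
A = np.block([[A1, A2], [A3, A4]])

A_plus = penrose_partition_pinv(A, r=2)
matches = np.allclose(A_plus, np.linalg.pinv(A, rcond=1e-10))
print(matches)
```

The comparison against `np.linalg.pinv` is only a sanity check on this example; the two explicit inverses inside the function are exactly the "classical matrix inverses" the method depends on.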

QS Decomposition. This method depends upon decomposing
the matrix A into the product of two matrices, Q and S, so
that:

$$ A = QS \tag{2.26} $$

where Q has orthonormal columns and S is an upper triangular
matrix (28:285). This decomposition is then used to obtain:

$$ A^{+} = S^{T}(SS^{T})^{-1}Q^{T} \tag{2.27} $$

The recommended numerical technique for accomplishing
the computations in (2.27) is to solve:

$$ (SS^{T})X = Q^{T} \tag{2.28} $$

and form the product:

$$ A^{+} = S^{T}X \tag{2.29} $$

which gives the desired $A^{+}$ generalized inverse (28:286).
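The solve-then-multiply procedure of (2.28) and (2.29) can be illustrated as follows. This sketch, in Python with NumPy, builds its own factors rather than factoring a given matrix: Q is assumed to have orthonormal columns and S is assumed to be upper trapezoidal with full row rank, and all sizes and values are made up for the illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, r = 5, 4, 2

# Q: m x r with orthonormal columns (taken from a QR factorization).
Q, _ = np.linalg.qr(rng.standard_normal((m, r)))
# S: r x n upper trapezoidal with nonzero pivots, hence full row rank.
S = np.triu(rng.standard_normal((r, n)))
S[np.arange(r), np.arange(r)] = [1.5, -2.0]

A = Q @ S                              # (2.26): A has rank r

X = np.linalg.solve(S @ S.T, Q.T)      # (2.28): solve (S S^T) X = Q^T
A_pinv_qs = S.T @ X                    # (2.29): A^+ = S^T X

agrees = np.allclose(A_pinv_qs, np.linalg.pinv(A, rcond=1e-10))
print(agrees)
```

Solving the small $r \times r$ system avoids forming $(SS^{T})^{-1}$ explicitly, which is the point of the recommended technique.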

A somewhat related technique uses a different
decomposition, namely:

$$ A = LU \tag{2.30} $$

where L and U are lower and upper triangular matrices,
respectively. Though this decomposition technique is
numerically cheaper to perform than the decomposition of
(2.26), computing $A^{+}$ from the L and U matrices of (2.30)
involves more operations than using operations (2.27)
through (2.29). Details comparing each technique can be
found in an article by Noble in Nashed's volume on
generalized inverses (28:285-288). Numerical techniques for
each decomposition fall under the headings of QR
decomposition (31:315-323) for (2.26) and LU factorization
for (2.30) (5:342-350).

Direct Computation. Given an $m \times k$ matrix A, of known
rank r, the most straightforward technique for computing $A^{+}$
is by the following formula:

$$ A^{+} = (A^{T}A)^{-1}A^{T} \tag{2.31} $$

which is the familiar least-squares solution of linear
equations and linear regression. The actual computation of
(2.31) can be accomplished by numerically solving the
following for the matrix X (28:278):

$$ (A^{T}A)X = A^{T} \tag{2.32} $$

The problem with (2.31) and (2.32) arises when the
matrix A contains linearly dependent columns. This
causes a singular matrix $(A^{T}A)$, which causes (2.31) to fail,
and worsens the numerical computation of (2.32) (28:279).

For example, compute $A^{+}$, using equation (2.31), for the
matrix A defined as:

$$ A = \begin{pmatrix} 1 & 1 \\ 1 & 1 \\ 1 & 1 \end{pmatrix} \tag{2.33} $$

The product $A^{T}A$ is the following:

$$ A^{T}A = \begin{pmatrix} 3 & 3 \\ 3 & 3 \end{pmatrix} \tag{2.34} $$

which is singular, implying $(A^{T}A)^{-1}$ does not exist. The
previously discussed decomposition methods improve the
conditions for computing the generalized inverse. Thus the
decomposition methods are recommended over this particular
direct method (28:284).
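The failure in this example is easy to reproduce. The sketch below, in Python with NumPy, forms $A^{T}A$ for the matrix of (2.33), confirms that it is singular, and shows that a generalized inverse is nevertheless available from a decomposition-based routine; the variable names are illustrative.

```python
import numpy as np

A = np.ones((3, 2))                      # the matrix of (2.33)
AtA = A.T @ A                            # the product of (2.34)

det_AtA = np.linalg.det(AtA)             # zero, so (A^T A)^{-1} does not exist
rank_AtA = np.linalg.matrix_rank(AtA)    # 1, not 2

# NumPy's pinv works from a singular value decomposition, not from (2.31).
A_pinv = np.linalg.pinv(A, rcond=1e-10)

print(AtA)
print(det_AtA, rank_AtA)
print(A_pinv)                            # every entry equals 1/6
```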

Recent advances in computer algebra enable researchers
to expand into symbolic computations. Computer-based expert
systems such as REDUCE, MAPLE, SMP, muMATH, and in
particular MACSYMA (50:3) provide such symbolic
computational environments. Sample manipulations are limits
and integrals. Frawley (11:419) uses a limit form of (2.31)
to compute generalized inverses in MACSYMA. The form used
is:

$$ A^{+} = \lim_{\lambda \to 0} \left( (A^{T}A + \lambda^{2}I)^{-1}A^{T} \right) \tag{2.35} $$
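The limit form can be approximated numerically by evaluating the regularized expression for a decreasing sequence of parameter values. The following Python/NumPy sketch is a floating-point stand-in for the symbolic limit taken in MACSYMA, not a reproduction of Frawley's code; the function name `regularized_inverse` and the chosen values of the parameter are assumptions.

```python
import numpy as np

def regularized_inverse(A, lam):
    """Evaluate (A^T A + lam^2 I)^{-1} A^T for a fixed lam > 0."""
    n = A.shape[1]
    return np.linalg.solve(A.T @ A + lam**2 * np.eye(n), A.T)

A = np.ones((3, 2))                       # A^T A is singular for this matrix
A_plus = np.linalg.pinv(A, rcond=1e-10)

# Distance to A^+ shrinks as lam approaches zero.
errors = [np.linalg.norm(regularized_inverse(A, lam) - A_plus)
          for lam in (1e-1, 1e-2, 1e-3)]
print(errors)
```

Note that the regularized matrix is invertible for every nonzero value of the parameter even though $A^{T}A$ itself is singular, which is why the limit form succeeds where (2.31) fails.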

Singular Value Decomposition. The singular value
decomposition method (31:323-330) is somewhat more
complicated. The method makes use of the eigenvalues and
eigenvectors of a matrix. A key idea is that a matrix of
orthonormal eigenvectors of a matrix A can be used to
decompose A into a diagonal form where the diagonal elements
are the eigenvalues of A. If M is this matrix of
eigenvectors, this decomposition is given in the following
equation:

$$ M^{-1}AM = \Sigma \tag{2.36} $$

where $\Sigma$ is the diagonal matrix of eigenvalues.

The technique for finding $A^{+}$ (31:326) involves the
orthonormal eigenvectors of the matrices $AA^{T}$ and $A^{T}A$, as
well as their nonzero eigenvalues, which are equal. The matrix V
contains the eigenvectors of $A^{T}A$ and the matrix U contains
the eigenvectors of $AA^{T}$. A diagonal matrix, $\Sigma$, is formed
as in (2.36); its nonzero entries are the singular values of A,
the square roots of those shared eigenvalues. The equation for
$A^{+}$ is then (31:337):

$$ A^{+} = V\Sigma^{-1}U^{T} \tag{2.37} $$

where only the nonzero diagonal entries of $\Sigma$ are inverted.

Although very straightforward computationally, the
problem with the method involves finding the eigenvalues and
eigenvectors of the matrices. Numerical computations to find
the eigenvalues, and the corresponding eigenvectors, of a
matrix can introduce error into the computations as well as
require significant computer resources.
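A short sketch of the singular value approach, in Python with NumPy, is given below. The function name `pinv_via_svd`, the tolerance used to decide which singular values count as zero, and the rank-one test matrix are all assumptions made for the illustration.

```python
import numpy as np

def pinv_via_svd(A, tol=1e-12):
    """A^+ = V diag(1/s_i) U^T, inverting only singular values above tol."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)   # A = U diag(s) Vt
    s_inv = np.zeros_like(s)
    nonzero = s > tol
    s_inv[nonzero] = 1.0 / s[nonzero]
    return Vt.T @ np.diag(s_inv) @ U.T

# Hypothetical rank-1 matrix: A = u v^T with u = (1, 2, 3), v = (1, 2).
A = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])
A_plus = pinv_via_svd(A)

# Squared singular values of A are the eigenvalues of A^T A.
singular_values = np.linalg.svd(A, compute_uv=False)
eigenvalues = np.linalg.eigvalsh(A.T @ A)

print(A_plus)
print(np.sort(singular_values**2), eigenvalues)
```

For a rank-one matrix $uv^{T}$ the result can be checked by hand, since $A^{+} = vu^{T} / (\|u\|^{2}\|v\|^{2})$.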

Other Techniques. As previously mentioned, the $A^{+}$
matrix is actually a unique member of a class of generalized
inverses. Further, mention was made of the fact that a
lesser inverse, such as the $A_{1}$ or $A_{1,2}$ inverse, may suffice
in some applications. Thus, there are techniques for
computing these subsets of matrices.


One technique to compute the $A_{1}$, or the $A_{2}$, matrix is
similar to Penrose's technique (31:208). To compute the $A_{1}$,
partition the matrix A such that $A = (B_{1} \mid B_{2})$, where the
dimensions of the submatrix $B_{1}$ are determined by the rank of
A, and the dimensions of $B_{2}$ are appropriate for the matrix.
Using these submatrices, the generalized inverse, $A_{1}$, is
computed as:

$$ A_{1} = (B_{1}^{T}B_{1})^{-1}B_{1}^{T} \tag{2.38} $$


The $A_{2}$ generalized inverse can be computed in a similar
fashion. The matrix A is partitioned so that

$$ A = \begin{pmatrix} C_{1} \\ C_{2} \end{pmatrix} \tag{2.39} $$

where, as before, the dimensions of the matrix $C_{1}$ are determined
by the rank of the matrix A. The dimensions of $C_{2}$ are again
appropriate for the matrix. The formula for the $A_{2}$
generalized inverse is:

$$ A_{2} = \begin{pmatrix} (C_{1}^{T}C_{1})^{-1}C_{1}^{T} & 0 \end{pmatrix} \tag{2.40} $$

The observant reader will note the similarity of the
previous formulas with equation (2.31) and the least-squares
estimator. For full-rank matrices, this technique is a
special case of the direct computation method. The drawback
with the technique is the need to predetermine the rank of
the matrix A. There is also the need to permute the matrix A
to obtain the proper partitioning.

One final technique is worth noting. This technique,
due to Urquhart (47), starts with any technique to compute
the $A_{1}$ matrix. Using this technique, the following matrices
are computed for the matrix A:

$$ B_{1} = (AA^{T})_{1} \qquad C_{1} = (A^{T}A)_{1} \tag{2.41} $$

These additional matrices, the original matrix A, along
with $A^{T}$, are then used to obtain representatives from each
set of generalized inverses. The following formulas compute
these generalized inverses:

$$ A_{1,2} = A_{1}AA_{1} \tag{2.42} $$

$$ A_{1,2,4} = A^{T}B_{1} \tag{2.43} $$

$$ A_{1,2,3} = C_{1}A^{T} \tag{2.44} $$

$$ A_{1,2,3,4} = A^{+} = A^{T}B_{1}AC_{1}A^{T} \tag{2.45} $$
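Urquhart's products can be tried out numerically by starting from arbitrary inverses that satisfy only Penrose condition (1). In the Python/NumPy sketch below, the helper names `penrose_flags` and `arbitrary_one_inverse`, the way the condition-(1) inverses are generated, and the test matrix are all assumptions made for illustration; the check simply reports which of the four Penrose conditions each product satisfies.

```python
import numpy as np

def penrose_flags(A, X, tol=1e-8):
    """Which of Penrose conditions (1)-(4) the pair (A, X) satisfies."""
    AX, XA = A @ X, X @ A
    return (bool(np.allclose(AX @ A, A, atol=tol)),
            bool(np.allclose(XA @ X, X, atol=tol)),
            bool(np.allclose(AX.T, AX, atol=tol)),
            bool(np.allclose(XA.T, XA, atol=tol)))

def arbitrary_one_inverse(M, rng):
    """Some G with M G M = M; generally not the Moore-Penrose inverse."""
    M_pinv = np.linalg.pinv(M, rcond=1e-10)
    W = rng.standard_normal(M_pinv.shape)
    return M_pinv + (np.eye(M.shape[1]) - M_pinv @ M) @ W

rng = np.random.default_rng(1)
A = np.array([[1.0, 0.0, 1.0],       # hypothetical 4 x 3 matrix of rank 2
              [0.0, 1.0, 1.0],
              [1.0, 1.0, 2.0],
              [2.0, 1.0, 3.0]])

B1 = arbitrary_one_inverse(A @ A.T, rng)   # condition-(1) inverse of A A^T
C1 = arbitrary_one_inverse(A.T @ A, rng)   # condition-(1) inverse of A^T A

flags_AtB = penrose_flags(A, A.T @ B1)     # A^T B1: satisfies (1), (2), (4)
flags_CAt = penrose_flags(A, C1 @ A.T)     # C1 A^T: satisfies (1), (2), (3)
A_plus = A.T @ B1 @ A @ C1 @ A.T           # the four-condition product

print(flags_AtB, flags_CAt, penrose_flags(A, A_plus))
```

Because $B_{1}$ and $C_{1}$ are perturbed by random terms, the final product agreeing with the Moore-Penrose inverse shows that the construction does not depend on which condition-(1) inverse is supplied.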

Each of the techniques discussed makes explicit use of
the predetermined rank of A and uses some form of matrix
decomposition. Iterative methods that converge to the
Moore-Penrose generalized inverse ($A^{+}$) were not addressed
in this review. Since an iterative method converges to $A^{+}$,
ideally $A^{+}$ must be known to say with certainty that $A_{k} \to A^{+}$,
where $A_{k}$ is defined as the intermediate values of the
inverse.

Standard computer software packages, such as IMSL,
EISPACK, and LINPACK, contain routines for the generalized
inverse (28:298), but none of these use Gaussian elimination as
the basis of the computations. These routines also cannot
handle matrices with polynomial elements, but then were not
really designed to do so. The technique used in this thesis,
Jones' ST method, uses Gaussian elimination, determines the
rank of the matrix A, generates all generalized inverses of
A, and is very applicable to multiparameter matrices. The
algorithm is easy to understand and is numerically and
computationally efficient (27:87). The theory behind the
technique as well as a description of the algorithm are
discussed at length in the next chapter.

Conclusion

This chapter examined the published knowledge regarding
the theory, application, and computation of generalized
inverses. The applications section highlighted the diversity
of the field in applying generalized inverses. The
minimum-norm, least-squares property of $A^{+}$ makes the inverse
valuable in statistics as well as in optimization theory. A
big benefit is that, since a generalized inverse exists for
all matrices, mathematical problem formulation is not
limited to just using square, non-singular matrices.


III. Theory and Background


Introduction

As chapter two highlighted, the theory of generalized
inverses has touched a wide range of disciplines. This
chapter presents the theoretical groundwork of this thesis
effort, generalized inverses of multiparameter matrices.

This is a new area of research, sparked by the growing
complexity of modern systems. Theory regarding constant
coefficient matrices falls short in solving current
problems. Computationally, expert systems such as MACSYMA
enable efficient manipulation of variable element matrices
and provide exact answers to complex problems. Thus, with
theory and computational tools available, the application of
multiparameter generalized inverses can progress.

The theory presented here has emerged from the constant
coefficient matrix theory. Most of the theorems have been
proved elsewhere, so the source of the proof is provided as
a reference. The purpose of this chapter is to provide an
understanding of the theory behind generalized inverses.

This insight comes from the contents of the theorem, not
necessarily from the proof of that theorem. The intent then
is to consolidate that theory at the very core of this
research effort.

Constant  Coefficient  Matrices 

The basic conditions a matrix must satisfy for the
matrix to be a generalized inverse were set forth by Penrose
in 1954. Penrose's purpose in considering these inverses was
to solve inconsistent linear equations, those involving
singular and rectangular matrices, cases where classical
matrix theory fell short. These conditions, now referred to
as the Penrose conditions, come from the following theorem:

Theorem 3.1: (32:406) The four equations:

$$ \text{(1)} \quad AXA = A \tag{3.1} $$

$$ \text{(2)} \quad XAX = X \tag{3.2} $$

$$ \text{(3)} \quad (AX)^{*} = AX \tag{3.3} $$

$$ \text{(4)} \quad (XA)^{*} = XA \tag{3.4} $$

have a unique solution for any matrix A.

Proof: See cited reference. Numbering of the equations
added for future reference.
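The four conditions are straightforward to test for a candidate matrix X. The sketch below, in Python with NumPy, uses an assumed helper named `penrose_conditions` and two hand-picked candidates for the same matrix A: one that satisfies all four equations and one that satisfies only (1) and (2).

```python
import numpy as np

def penrose_conditions(A, X, tol=1e-10):
    """Report which of the Penrose equations (3.1)-(3.4) hold for X."""
    AX, XA = A @ X, X @ A
    return {
        1: bool(np.allclose(AX @ A, A, atol=tol)),          # A X A = A
        2: bool(np.allclose(XA @ X, X, atol=tol)),          # X A X = X
        3: bool(np.allclose(AX.conj().T, AX, atol=tol)),    # (A X)* = A X
        4: bool(np.allclose(XA.conj().T, XA, atol=tol)),    # (X A)* = X A
    }

A = np.ones((3, 2))

X_mp = np.full((2, 3), 1.0 / 6.0)          # satisfies all four equations
X_weak = np.array([[1.0, 0.0, 0.0],        # satisfies only (1) and (2)
                   [0.0, 0.0, 0.0]])

mp_result = penrose_conditions(A, X_mp)
weak_result = penrose_conditions(A, X_weak)
print(mp_result)
print(weak_result)
```

The second candidate shows that equations (1) and (2) alone do not pin down a unique matrix; uniqueness comes from imposing all four.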

In the course of his proof, Penrose showed that the
matrix A did not necessarily have to be a square matrix.
Since the classical inverse from matrix theory covers only
non-singular, square matrices, Penrose said his inverse was
a generalization of the notion of a matrix inverse. He
called his inverse a pseudoinverse and designated it as $A^{+}$.
Along with the proof, Penrose provided two key lemmas. These
are:

Lemma 1.1: $(A^{+})^{+} = A$

Lemma 1.2: If A is a non-singular matrix, then $A^{+} = A^{-1}$
(32:408).

With Lemma 1.2, Penrose tied together the notion of a
generalized inverse with the classical inverse theory he
sought to generalize. Thus, working with the $A^{+}$ inverse did
in fact provide for a more general methodology than using
just the $A^{-1}$ inverse.

The next theorem, and its associated corollaries, gave
Penrose the ability to solve all types of linear equations,
both consistent and inconsistent.

Theorem 3.2. (32:409) A necessary and sufficient
condition for the equation

$$ AXB = C \tag{3.5} $$

to have a solution is

$$ AA^{+}CB^{+}B = C \tag{3.6} $$

in which case the general solution is

$$ X = A^{+}CB^{+} + Y - A^{+}AYBB^{+} \tag{3.7} $$

where Y is arbitrary.

Proof: See cited reference.
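The solvability test (3.6) and the general solution (3.7) can be exercised on a small example. The following Python/NumPy sketch uses made-up matrices; the right-hand side C is built from a known X so that the equation is consistent by construction, and the helper name `solvable` is an assumption.

```python
import numpy as np

rng = np.random.default_rng(2)
A = np.array([[1.0, 2.0], [2.0, 4.0]])              # singular, rank 1
B = np.array([[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])    # rank 2
X_known = np.array([[1.0, -1.0], [0.0, 2.0]])
C = A @ X_known @ B                                 # consistent by construction

A_p = np.linalg.pinv(A, rcond=1e-10)
B_p = np.linalg.pinv(B, rcond=1e-10)

def solvable(C):
    """Condition (3.6): A A^+ C B^+ B = C."""
    return bool(np.allclose(A @ A_p @ C @ B_p @ B, C))

# General solution (3.7) for an arbitrary Y.
Y = rng.standard_normal((2, 2))
X = A_p @ C @ B_p + Y - A_p @ A @ Y @ B @ B_p
solves = bool(np.allclose(A @ X @ B, C))

# A right-hand side whose first column lies outside the column space of A.
C_bad = np.array([[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]])
print(solvable(C), solves, solvable(C_bad))
```

Different choices of Y give different solutions X, which is the sense in which (3.7) describes the whole solution set.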

Corollary 3.2.1. The general solution of the vector
equation:

$$ Px = c \tag{3.8} $$

is

$$ x = P^{+}c + (I - P^{+}P)y \tag{3.9} $$

where y is arbitrary, provided that the equation has a
solution.

Corollary 3.2.2. A necessary and sufficient condition
for the equations:

$$ AX = C, \qquad XB = D \tag{3.10} $$

to have a common solution is that each equation should
individually have a solution and that

$$ AD = CB \tag{3.11} $$

Proof: See cited reference.

These then are the pertinent results from the classical
work of Penrose. Each forms the basis for future work, as
shown throughout the remainder of this chapter. It should be
noted that soon after the publication of Penrose's work, it
became evident that the four Penrose conditions were
equivalent to earlier work done by Moore. In his earlier
work, Moore defined the generalized inverse, G, of a matrix
A as satisfying:

$$ AG = P_{A}, \qquad GA = P_{G} \tag{3.12} $$

where $P_{X}$ is defined as the orthogonal projection onto the
column space of the matrix X = G or A (28:xi-xii). Thus, the
$A^{+}$ inverse is generally referred to as the Moore-Penrose
generalized inverse (28:111).

The work of Penrose laid the groundwork for later
advances in generalized inverse theory. However, to properly
understand some of the later work, particularly the work of
Jones and others with the ST computational method, as well
as the connection between the early work of Moore and
Penrose, one must first understand some fundamental ideas
from linear algebra. In particular there is Strang's
discussion of the four fundamental subspaces associated with
a matrix and extensions of this work into the multiparameter
matrix area. Strang develops these concepts in the form of
two theorems, which he labels as Fundamental Theorems of
Linear Algebra. From Strang (46:75):

Fundamental Theorem of Linear Algebra, Parts I and II

1. $\mathcal{R}(A^{T})$ = row space of matrix A; dimension r

2. $\mathcal{N}(A)$ = null space of matrix A; dimension n-r

3. $\mathcal{R}(A)$ = column space of matrix A; dimension r

4. $\mathcal{N}(A^{T})$ = left null space of matrix A; dimension m-r

5. $\mathcal{N}(A) = (\mathcal{R}(A^{T}))^{\perp}$

6. $\mathcal{R}(A^{T}) = (\mathcal{N}(A))^{\perp}$

7. $\mathcal{N}(A^{T}) = (\mathcal{R}(A))^{\perp}$

8. $\mathcal{R}(A) = (\mathcal{N}(A^{T}))^{\perp}$

The above theorem says that associated with any matrix
A there are four fundamental subspaces, and these subspaces
are related according to the orthogonal complement
relationships depicted in 5 through 8. Although Strang aimed
his theorem at matrices with constant coefficients, the
results are just as valid for matrices defined over the
polynomial field.
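For a constant-coefficient matrix, orthonormal bases for the four subspaces can be read off a singular value decomposition. The Python/NumPy sketch below is one way to do this; the function name `four_subspaces`, the tolerance, and the sample matrix are assumptions, and the code does not address the polynomial case.

```python
import numpy as np

def four_subspaces(A, tol=1e-10):
    """Orthonormal bases (as columns) for the four fundamental subspaces."""
    U, s, Vt = np.linalg.svd(A)        # full SVD: U is m x m, Vt is n x n
    r = int(np.sum(s > tol))           # numerical rank
    return {
        "rank": r,
        "row_space": Vt[:r].T,         # dimension r
        "null_space": Vt[r:].T,        # dimension n - r
        "column_space": U[:, :r],      # dimension r
        "left_null_space": U[:, r:],   # dimension m - r
    }

# Hypothetical 3 x 4 matrix of rank 2 (third row = first row + second row).
A = np.array([[1.0, 2.0, 0.0, 1.0],
              [0.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 1.0, 1.0]])
S = four_subspaces(A)
dims = {name: basis.shape[1] for name, basis in S.items() if name != "rank"}
print(S["rank"], dims)

# Orthogonal complement relationships (items 5 through 8).
row_perp_null = np.allclose(S["row_space"].T @ S["null_space"], 0, atol=1e-10)
col_perp_left = np.allclose(S["column_space"].T @ S["left_null_space"], 0, atol=1e-10)
print(row_perp_null, col_perp_left)
```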

For the equation Ax = b, the idea of subspaces is
critical. For a solution to exist, the vector b must lie
within the column space of A. In other words, b is a linear
combination of the columns of A, with the elements of x as
the coefficients. Those vectors x that
satisfy the homogeneous equation Ax = 0 are members of the
nullspace of the matrix A. This nullspace is also referred
to as the kernel of the linear transformation
provided by the matrix A, and its dimension as the nullity
(1:216).

The idea of rank of a matrix must also be understood.
The rank of A is the number of linearly independent vectors
that span the column (and row) space of the matrix. A more
traditional definition is that the rank of the matrix A is
the dimension of the largest nonsingular submatrix of A. If
A is n x n, and the rank, r, equals n, then A has full rank.

If A is m x n and m ≠ n, then A is not full rank in this
sense. Only full rank matrices have a classical inverse,
$A^{-1}$, and provide a unique solution vector, x, to the
equation, Ax = b.

For full rank matrices, the homogeneous equation Ax=0
is satisfied only by the trivial solution, x=0. The
nullspace consists of this single point. In singular
matrices, the nullspace is of dimension n-r, and the left
nullspace of dimension m-r, depending upon the rank of the
matrix. The solutions of Ax=0
are non-trivial and form a non-trivial subspace, the
nullspace, that is orthogonal to the row space, which is
of dimension r, the rank of A.

Looking at Ax=b, a solution, x, can be found if b is
orthogonal to the left nullspace, that is, if b is a member
of the column space. But if the equation is inconsistent then
the $A^{-1}$ inverse does not exist and the solution is no longer
unique. The best solution to the problem must then be selected
from among the possibly infinite number of solutions. This
turns out to be the point, call it t, that is in the column
space of A and is closest to the point b. This solution
is deemed "best" since it is the closest solution among all
possible solutions. Typically least-squares or minimum-norm
criteria are used to determine the closest solution.

This is where the generalized inverse comes into play,
as alluded to in Chapter II by Penrose's observation that $A^{+}$
provides the least-squares, minimum-norm solution. Applying
$A^{+}$ to b amounts to projecting b onto the column space of A.
In terms of least-squares, $x = (A^{T}A)^{-1}A^{T}b$ is the vector
whose image Ax is the projection of b onto the column space.
This projection is accomplished
by the $(A^{T}A)^{-1}A^{T}$ term, which, as shown in Chapter II, can
sometimes be used to find the $A^{+}$ inverse. Thus, the
reasoning behind the claim that $A^{+}$ provides the minimum-norm
or least-squares (closest) solution to the inconsistent
problem.

Since the row space and nullspace are orthogonal
complements, any solution vector consists of two portions.
One portion is a projection onto the row space, the other is
the projection onto the nullspace. This solution vector can
thus be written as $x = (x_{r} + w)$. Here $x_{r}$ is the row space
component and w is the nullspace component. Strang (46:138)
points out that any solutions to Ax=b will share a common $x_{r}$
and differ only in the nullspace component, which is the
solution to the homogeneous equation Ax=0. This homogeneous
portion can also be expressed in a general form as
$x = (I - A^{+}A)z$, for arbitrary z (36:23-26; 34:35).

Taking these ideas into account, Strang states the
following conclusion, which is found embodied in the results
of Corollary 3.2.1 (46:138):

The general solution is the sum of one particular
solution (in this case $x_{r}$) and an arbitrary
solution z of the homogeneous equation.
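The decomposition of a solution into a shared row space component and a varying nullspace component can be observed directly. The Python/NumPy sketch below uses an illustrative consistent system; the matrix, the right-hand side, and the random vectors z are assumptions chosen for the demonstration.

```python
import numpy as np

rng = np.random.default_rng(3)
A = np.array([[1.0, 2.0, 0.0, 1.0],     # hypothetical 3 x 4 matrix of rank 2
              [0.0, 1.0, 1.0, 0.0],
              [1.0, 3.0, 1.0, 1.0]])
b = A @ np.ones(4)                      # consistent right-hand side

A_p = np.linalg.pinv(A, rcond=1e-10)
x_r = A_p @ b                           # row space component of every solution
N = np.eye(4) - A_p @ A                 # (I - A^+ A): projector onto the nullspace

# Three solutions built from arbitrary z, as in Corollary 3.2.1.
solutions = [x_r + N @ rng.standard_normal(4) for _ in range(3)]

all_solve = all(np.allclose(A @ x, b) for x in solutions)
same_row_part = all(np.allclose(A_p @ A @ x, x_r) for x in solutions)
min_norm = all(np.linalg.norm(x) >= np.linalg.norm(x_r) - 1e-12 for x in solutions)
print(all_solve, same_row_part, min_norm)
```

Since the nullspace component is orthogonal to the row space component, adding it can only increase the length of the solution, which is why the row space component alone is the minimum-norm solution.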

Multiparameter Matrices

The first attempt to study generalized inverses of
multiparameter matrices was by Bose and Mitra (4:491).

Their motivation for delving into this new area was the
study of multi-input/multi-output control systems. These
systems often require the use of matrices having elements
that are not constant, but variable. Thus, Bose and Mitra
sought to extend the extensive work already done for
constant matrices into the multiparameter matrices defined
over rings of polynomials of a single variable (42:514).

Theorem 3.3. (4:491) Any (m x n) integer matrix A
having rank r will have an integer matrix for its
generalized inverse if and only if A can be expressed in the
Smith canonical form (see Appendix A for a definition of
Smith form):

$$ A = MDN \tag{3.13} $$

where M and N are integer matrices with determinant equal to ±1
and D is of the form:

$$ D = \begin{pmatrix} I_{r} & 0 \\ 0 & 0 \end{pmatrix} \tag{3.14} $$

$I_{r}$ being the identity matrix of order r.

Proof: See cited reference.

Bose and Mitra use this theorem to extend the notions
to multiparameter matrices with the following theorem:

Theorem 3.4. (4:492) Any (m x n) polynomial matrix
A(z), of rank r with coefficients in a number field, will
have a polynomial matrix (with coefficients in the same
field) for its generalized inverse if and only if A(z) can
be expressed in the Smith canonical form

$$ A(z) = M(z)\,D\,N(z) \tag{3.15} $$

where M(z) and N(z) are polynomial matrices with determinant
equal to ±1 and D is of the form:

$$ D = \begin{pmatrix} D_{r} & 0 \\ 0 & 0 \end{pmatrix} \tag{3.16} $$

$D_{r}$ being an (r x r) diagonal matrix of constants belonging
to the chosen number field.

Proof: See cited reference.

Bose and Mitra use reduction to Smith normal form to
characterize the "weak generalized inverses", which are those
inverses satisfying Penrose conditions (1) and (2). They
also addressed just the single variable, polynomial matrix
case. Extended results were obtained by Sontag in June 1980
with the following theorem:

Theorem 3.5. (42:514) The following statements are
equivalent for a matrix $A = A(z_1, z_2, \ldots, z_n)$ over
$R = \mathbb{C}[z_1, z_2, \ldots, z_n]$:

a) A has a weak generalized inverse (WGI).

b) There exist square, unimodular (i.e. nonzero
scalar determinant) matrices P and Q defined
over R such that $A = P A_0 Q$ with

\[ A_0 = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} \tag{3.17} \]

where $I_r$ is the identity matrix of order r = rank(A).

c) As a function of the complex variables
$(z_1, z_2, \ldots, z_n)$, the rank of $A(z_1, z_2, \ldots, z_n)$ is
constant.

Proof: See cited reference.

Theorem 3.6. (42:516) The following statements are
equivalent for any matrix A over R:

a) A has a generalized inverse.

b) A has a weak generalized inverse.

c) A has constant rank over all $(z_1, z_2, \ldots, z_n)$ in $\mathbb{R}^n$.

d) A can be written as $P A_0 Q$ with P and Q
unimodular R matrices (meaning having
determinant not equal to zero for all
$(z_1, z_2, \ldots, z_n)$ in $\mathbb{R}^n$) and

\[ A_0 = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} \tag{3.18} \]

with $I_r$ being the identity matrix of order r = rank(A).

Proof: See cited reference.

Sontag made two significant advances. First, the
results were now extended to matrices defined over $\mathbb{R}^n$.
Secondly, he showed that a generalized inverse in fact forces the
existence of a Smith form for the original matrix. Recall
the work of Bose and Mitra, where the Smith form implied
existence of the generalized inverse. The question Sontag
faced was how to determine the P and Q matrices that perform
the necessary transformations.

Sontag's Factorization

Sontag used a full-rank factorization of the matrix A
(42:516) to derive a formula for $A^+$. Recall that a factorization
of A requires two matrices, B and C, such that A = B C
for $B \in \mathbb{C}^{m \times r}$, $C \in \mathbb{C}^{r \times n}$ and rank(A) = r (36:5). But clearly B
and C function in the same role as do the P and Q matrices
referenced by Sontag. There is still the need to determine P
and Q. Another method (31:326), namely the singular value
decomposition, employed the matrix M of eigenvectors of $A A^T$
and $A^T A$. The Moore-Penrose inverse, $A^+$, is then computed by
the formula:

\[ A^+ = M \Sigma^{-1} M^{-1} \tag{3.19} \]

where $\Sigma^{-1}$ is the inverse of the matrix $\Sigma$ whose diagonal
elements are the eigenvalues of $A A^T$, or $A^T A$. The nonzero
eigenvalues of $A A^T$ and $A^T A$ coincide.

ST  Method 

The easiest method of computation involves use of the
ST canonical form, along with extensions of work from Sontag
and classical linear algebra, to determine the P and Q
matrices while reducing the A matrix to its Smith form.
This method is the ST method of Jones (7:3-4; 27:vi).
Extensive detail of how the ST method is implemented can be
found in a recent AFIT thesis by Murray. Although Murray
considered only the constant coefficient case, the technique
remains the same for reducing multiparameter matrices. A
brief explanation of the technique is followed by the
underlying theorems.

Consider $A \in \mathbb{C}^{m \times n}$. Augment A with identity matrices
below and to the right to obtain the following form:

\[ \left[\begin{array}{c|c} A & I_m \\ \hline I_n & 0 \end{array}\right] \tag{3.20} \]

where $I_n \in \mathbb{C}^{n \times n}$ and $I_m \in \mathbb{C}^{m \times m}$. This is referred to as the
initial ST canonical form.

Reduce the A matrix to the identity matrix, $I_r$, where
the dimensions of the identity matrix, r x r, are equal to the
rank of the original A matrix. Any row operations performed
in the reduction are carried out on the augmented matrix to
the right. Similarly, any column operations are carried out
on the augmented matrix directly below the original matrix.
Once the A matrix has been reduced to its identity form,
the augmented form is now in the final ST canonical form:

\[ \left[\begin{array}{cc|c} I_r & 0 & T \\ 0 & 0 & M \\ \hline S & N & 0 \end{array}\right] \tag{3.21} \]


All that is required to accomplish this initial
reduction are elementary transformations, commonly known as
elementary row and column operations. Whether the matrices
involved are constant coefficient or multiparameter
matrices, the elementary transformations remain the same.
Appendix A contains the definition of these operations.

If A is full row rank, the M submatrix will not exist.
If A is full column rank, the N submatrix will not exist. If
A is both full row and column rank, then A has an inverse $A^{-1}$ and a
trivial nullspace, the point zero. In this case, neither the
M nor the N submatrices will exist in the final ST canonical
form. The rank of A, rank(A) = r, is determined as a result
of the elementary transformations used to reduce the
augmented form (3.20) to the reduced canonical form (3.21).

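From the form of (3.21), the P and Q matrices required
by Sontag can be read off directly. These matrices are:

\[ P = \begin{bmatrix} T \\ M \end{bmatrix}, \qquad Q = \begin{bmatrix} S & N \end{bmatrix} \tag{3.22} \]

A quick check of a reduced matrix will verify that the
product $P A Q = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$ does indeed hold.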

From the form of (3.21), all the generalized inverses
of the matrix A may be generated. The next set of theorems
proves this in addition to proving the validity of the above
reduction technique.

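Theorem 3.7. (9:23; 27:16) For any given matrix
$A \in \mathbb{C}^{m \times n}$ there exist two nonsingular matrices $P \in \mathbb{C}^{m \times m}$ and
$Q \in \mathbb{C}^{n \times n}$ such that

\[ \left[\begin{array}{c|c} A & I_m \\ \hline I_n & 0 \end{array}\right] \quad \text{and} \quad \left[\begin{array}{cc|c} I_r & 0 & T \\ 0 & 0 & M \\ \hline S & N & 0 \end{array}\right] \tag{3.23} \]

are equivalent.

Proof: See cited reference.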

Theorem 3.7 takes the results of Sontag's work (Theorem
3.6) and incorporates them into a computational technique. The
P and Q matrices of the form given in (3.22) are the
matrices required by Sontag and provide the weak generalized
inverse of the original matrix A. However, the real strength
of the ST method is embodied in extending the P and Q
submatrices in accordance with the following theorems.

Theorem 3.8. (9:24) From the matrices defined in
Theorem 3.7, an $A_{1,2}$ matrix (weak generalized inverse) is
determined by the product of the submatrices S and T.

Proof: See cited reference.

This means that the WGI of a matrix A is attainable
simply through elementary row and column operations
performed on the augmented form given by (3.20). Higher
generalized inverses are obtained using properties of
orthogonality. In particular, the rows (columns) of M (N)
are made orthogonal to the rows (columns) of T (S).
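
The reduction just described can be sketched for the constant coefficient case. The Python below is a minimal illustration, not the implementation of the cited references: the names `st_reduce` and `matmul` are hypothetical, the pivoting order is an assumption, and exact rational arithmetic stands in for the polynomial arithmetic needed in the multiparameter case. It records row operations in P and column operations in Q, reads off T, M, S, and N, and checks Penrose conditions (1) and (2) for the product ST.

```python
from fractions import Fraction


def matmul(X, Y):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def st_reduce(A):
    """Reduce A by elementary operations; return (r, P, Q) with P A Q = [[I_r, 0], [0, 0]]."""
    m, n = len(A), len(A[0])
    W = [[Fraction(v) for v in row] for row in A]                       # working copy of A
    P = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]   # records row operations
    Q = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]   # records column operations
    r = 0
    while r < min(m, n):
        # Look for a nonzero pivot in the part of W not yet reduced.
        pivot = next(((i, j) for i in range(r, m) for j in range(r, n) if W[i][j] != 0), None)
        if pivot is None:
            break
        i, j = pivot
        W[r], W[i] = W[i], W[r]                  # row interchange, mirrored in P
        P[r], P[i] = P[i], P[r]
        for mat in (W, Q):                       # column interchange, mirrored in Q
            for row in mat:
                row[r], row[j] = row[j], row[r]
        p = W[r][r]                              # scale the pivot row so the pivot is 1
        W[r] = [v / p for v in W[r]]
        P[r] = [v / p for v in P[r]]
        for k in range(m):                       # clear the pivot column with row operations
            if k != r and W[k][r] != 0:
                f = W[k][r]
                W[k] = [a - f * b for a, b in zip(W[k], W[r])]
                P[k] = [a - f * b for a, b in zip(P[k], P[r])]
        for c in range(n):                       # clear the pivot row with column operations
            if c != r and W[r][c] != 0:
                f = W[r][c]
                for mat in (W, Q):
                    for row in mat:
                        row[c] -= f * row[r]
        r += 1
    return r, P, Q


A = [[1, 2, 3], [2, 4, 6]]
r, P, Q = st_reduce(A)
T, M = P[:r], P[r:]                  # P = [T; M]
S = [row[:r] for row in Q]           # Q = [S N]
N = [row[r:] for row in Q]
G = matmul(S, T)                     # candidate weak generalized inverse
Af = [[Fraction(v) for v in row] for row in A]
print(r, matmul(matmul(Af, G), Af) == Af, matmul(matmul(G, Af), G) == G)
```

For this rank-one matrix the sketch also gives A N = 0 and M A = 0, in line with the roles the N and M submatrices play in the final canonical form.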

Theorem 3.9. (9:25-27; 27:33-35) From the matrices
defined in Theorem 3.7, if the condition $M T^T = 0$ holds
then an $A_{1,2,3}$ matrix generalized inverse is defined by the
product S T.

Proof: See cited reference.

Theorem 3.10. (9:28; 27:33-36) From the matrices
defined in Theorem 3.7, if the condition $N^T S = 0$ holds
then an $A_{1,2,4}$ matrix generalized inverse is defined by the
product S T.

Proof: See cited reference.

The final theorem in this set of theorems comes from
the work of Ooma and Murray, who combine the previous two
theorems to provide the conditions under which to produce
the unique Moore-Penrose generalized inverse, the $A_{1,2,3,4}$
or simply the $A^+$.

Theorem 3.11. (9:29; 27:33-41) From the matrices
defined in Theorem 3.7, if the conditions $M T^T = 0$ and
$N^T S = 0$ hold, then the product ST defines the $A_{1,2,3,4}$ matrix
generalized inverse.

Proof: See cited reference.

The ST technique can be summarized by the schematic in
Figure 3. The matrix A gives rise to the initial canonical
form by augmenting A with identity matrices below and to the
right. Through elementary transformations and
orthogonalizations, the initial canonical form is
transformed into the final canonical form. During the
transformation process, each of the $A_{1,2}$, $A_{1,2,3}$, $A_{1,2,4}$,
and the $A_{1,2,3,4}$ generalized inverses can be computed.

In Corollary 3.2.1, Penrose gave a general form for the
solution of the Ax = b equation. The geometry of this
solution form was then briefly discussed, based in large
part upon Strang's work. Presented here as a corollary is a
result of the subspace concept and the ST reduction process.

Corollary 3.11.1. (9:39; 27:25; 7:40; 19:463) The
equation Ax = b has a solution x if and only if Mb = 0. In
this case the general solution is given by:

\[ x = (ST)\,b + N z \tag{3.24} \]

where the vector z is arbitrary, and the S, T, M, and N
matrices come from the final ST canonical form as shown in
(3.23).
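
A small numerical illustration of the corollary, offered as a sketch rather than as part of the cited work: the matrix A below and the hand reduction behind its S, T, M, and N submatrices are chosen here for illustration, and the names `matvec` and `general_solution` are hypothetical.

```python
from fractions import Fraction as F


def matvec(X, v):
    """Matrix-vector product for a list-of-lists matrix."""
    return [sum(a * b for a, b in zip(row, v)) for row in X]


A = [[F(1), F(2)], [F(2), F(4)]]
# Submatrices obtained by hand from [A I; I 0]:
# add (-2) x row 1 to row 2, then add (-2) x column 1 to column 2.
T = [[F(1), F(0)]]
M = [[F(-2), F(1)]]
S = [[F(1)], [F(0)]]
N = [[F(-2)], [F(1)]]


def general_solution(b, z):
    """Return x = (ST) b + N z, or None when the test M b = 0 fails."""
    if any(v != 0 for v in matvec(M, b)):
        return None
    particular = matvec(S, matvec(T, b))
    homogeneous = matvec(N, z)
    return [p + h for p, h in zip(particular, homogeneous)]


x = general_solution([F(3), F(6)], [F(5)])
print(x, matvec(A, x))
print(general_solution([F(3), F(7)], [F(5)]))
```

The first right-hand side lies in the column space of A, so every choice of z gives a solution; the second fails the M b = 0 test and no solution is returned.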

Figure 3. ST Technique Schematic

A final point regarding the general power of the ST
computational technique is the strong interface it has with
the Fundamental Theorem of Linear Algebra previously
discussed. Once fully reduced, the final ST canonical form
provides:

• the rank of the matrix A (dimension of $I_r$)

• a basis for the column space, $\mathcal{R}(A)$, of A given in
the S submatrix

• a basis for the row space, $\mathcal{R}(A^T)$, of A given in the
T submatrix

• a basis for the null space, $\mathcal{N}(A)$, of A in submatrix N

• a basis for the left null space, $\mathcal{N}(A^T)$, of A in the
submatrix M

Each of the above is a byproduct of the elementary
transformations and orthogonalizations performed to
determine the generalized inverses of a given matrix A.
Common  Solutions  of  Sets  of  Equations 

The final topic addressed in this chapter involves
necessary and sufficient conditions for common solutions of
matrix equations. These sets of equations arise in many
applications, for instance, network design problems or
critical path systems. In the case of constant coefficients
and multiparameter systems, parallel processing techniques
can be exploited to determine solutions to systems more
efficiently. Current hardware and software technology limits
the parallel processing applications for the multiparameter
case, but this section shows that the theory is in place.

Common solutions to sets of multiparameter matrix
equations have been extended from the work done on constant
coefficient matrix equations. However, the details of the
theorem presented are provided for the first time. Previous
work by Morris and Odell (26) and then by Jones (17) left
these details out.

Mitra followed Penrose's original work with an
extension to the common solution of two matrix equations
(24:213). In particular, Mitra used Corollary 3.2.2 to prove
the following:

Theorem 3.12. (24:214) Let $A_1$, $A_2$, $B_1$, and $B_2$ be
non-negative definite matrices. A necessary and sufficient
condition for the consistent equations:

\[ A_1 X B_1 = C_1, \qquad A_2 X B_2 = C_2 \tag{3.25} \]

to have a common solution is

\[ A_1 (A_1 + A_2)^- C_2 (B_1 + B_2)^- B_1 = A_2 (A_1 + A_2)^- C_1 (B_1 + B_2)^- B_2 \tag{3.26} \]

in which case the general solution is

\[ X = (A_1 + A_2)^- (C_1 + Y + Z + C_2)(B_1 + B_2)^- + U - (A_1 + A_2)^- (A_1 + A_2)\, U\, (B_1 + B_2)(B_1 + B_2)^- \tag{3.27} \]

where U is arbitrary, and Y and Z are arbitrary matrices
satisfying respectively the equations

\[ A_2 (A_1 + A_2)^- Y = A_1 (A_1 + A_2)^- C_2, \qquad Y (B_1 + B_2)^- B_1 = C_1 (B_1 + B_2)^- B_2 \tag{3.28} \]

and

\[ A_1 (A_1 + A_2)^- Z = A_2 (A_1 + A_2)^- C_1, \qquad Z (B_1 + B_2)^- B_2 = C_2 (B_1 + B_2)^- B_1 \tag{3.29} \]

where the $(\,\cdot\,)^-$ notation denotes an $A_1$ matrix generalized
inverse.

Proof: See cited reference.

This proof provides an expression for the general
common solution. However, Mitra's work was limited to the
case of only two matrix equations. A generalization to n constant
coefficient matrix equations comes from the work of Morris
and Odell in 1968 (26). This was then extended to
multiparameter matrices, defined over the ring of polynomial
elements, by Jones (17) in 1987. Since both theorems are
similar in content, only the latter is presented.

Theorem 3.13. (17:768) Let $A_i$ and $B_i$ be matrices of
conformable dimensions for i = 1,...,m. Define the following
relationships:

\[ C_1 = A_1, \qquad D_1 = B_1, \qquad E_1 = A_1^- B_1, \qquad F_1 = I - A_1^- A_1 \tag{3.30} \]

and

\[ C_k = A_k F_{k-1}, \qquad D_k = B_k - A_k E_{k-1}, \qquad E_k = E_{k-1} + F_{k-1} C_k^- D_k, \qquad F_k = F_{k-1}(I - C_k^- C_k) \]

Then $A_i x = B_i$, for i = 1,...,m, has a common solution if
and only if $C_i C_i^- D_i = D_i$ for i = 1,...,m. In this case the
general common solution is given by

\[ x = E_m + F_m z \tag{3.31} \]

where z is arbitrary.

Proof: See cited references for general proof. Since
neither reference provides the explicit proof, this detailed
proof is presented in Appendix B. The proof is a double
induction proof in that both the conditions for existence of
common solutions and the defining relationships for those
solutions are proved using inductive methods.
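
The recursion of Theorem 3.13 can be sketched for the simplest setting, in which each $A_i$ is a single constant row and each $B_i$ is a scalar. This is an illustrative assumption, as are the names `row_ginv` and `common_solution`; the sketch starts from $E_0 = 0$ and $F_0 = I$, which reproduces the starting values in (3.30), and uses $c^T / (c\,c^T)$ as the generalized inverse of a nonzero row c.

```python
from fractions import Fraction


def row_ginv(c):
    """Generalized inverse of a 1 x n row, returned as an n-vector (zero row -> zero vector)."""
    norm2 = sum(v * v for v in c)
    return [v / norm2 for v in c] if norm2 != 0 else [Fraction(0)] * len(c)


def common_solution(rows, rhs):
    """Return (E, F) with general solution x = E + F z, or None if no common solution exists."""
    n = len(rows[0])
    E = [Fraction(0)] * n                                                # E_0 = 0
    Fm = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]   # F_0 = I
    for a, b in zip(rows, rhs):
        a = [Fraction(v) for v in a]
        C = [sum(a[i] * Fm[i][j] for i in range(n)) for j in range(n)]   # C_k = A_k F_{k-1}
        D = Fraction(b) - sum(ai * ei for ai, ei in zip(a, E))           # D_k = B_k - A_k E_{k-1}
        Cg = row_ginv(C)                                                 # C_k^-
        if sum(c * g for c, g in zip(C, Cg)) * D != D:                   # test C_k C_k^- D_k = D_k
            return None
        FCg = [sum(Fm[i][j] * Cg[j] for j in range(n)) for i in range(n)]  # F_{k-1} C_k^-
        E = [e + f * D for e, f in zip(E, FCg)]                          # E_k
        Fm = [[Fm[i][j] - FCg[i] * C[j] for j in range(n)] for i in range(n)]  # F_k
    return E, Fm


result = common_solution([[1, 1, 0], [0, 1, 1]], [2, 3])
print(result[0])
print(common_solution([[1, 1], [2, 2]], [1, 3]))
```

Each pass through the loop uses only the current equation and the previous E and F, which is the row-by-row structure the theorem exposes.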

Applications 

These last two theorems, 3.12 and 3.13, provide some
powerful applications. For instance, examine the following
problem:

\[ A X = B \tag{3.32} \]

where A and B are matrices with p rows.

If each row of the matrix equation is treated as a
separate matrix equation, then the result of Theorem 3.13
applies to the resulting system of p matrix equations
(26:273). This system may then be solved on a parallel
processing implementation, greatly reducing the processing time.
Conclusion 

This chapter has consolidated much of the theoretical
knowledge in the area of generalized inverses of matrices.
The intent has been to provide a readable, yet thorough,
presentation of the underlying foundations for this thesis
effort. The trend towards multiparameter matrices has been
clearly defined and explained. The extensions of constant
coefficient matrix theory to sets of equations provide a
promising area of research involving parallel processing.
However, it is the multiparameter trend that is the focus of
this thesis. Thus, in the next chapter, some particular
examples are selected and solved using the generalized
inverses of multiparameter matrices.
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IV. 


Introduction

While the previous chapter laid the theoretical
groundwork for this thesis, this chapter addresses how to
use the generalized inverse of a multiparameter matrix as a
tool to solve problems arising in optimization. The history
of the generalized inverse supports the trend towards more
work involving multiparameter matrices as a necessity to
keep pace with the ever increasing complexity of today's
problems. The increasing capabilities of modern computer
systems allow researchers to investigate, and solve,
problems that previously took months to solve by hand
(50:2). Before any computer solution can be implemented,
however, the theory and technique must be thoroughly
established.

Since the generalized inverse plays a key role in a
diverse range of disciplines, a small cross-section has been
selected. However, the techniques employed are generally
applicable to many other areas. In addition, a couple of
"counter-examples" are provided in examples 8 and 9. These
are labeled counter-examples since the generalized inverse
technique does not provide a value for the optimal solution.
However, though the optimal solution is not found, valuable
information regarding the function can be obtained from the
general form of the solution. The two examples,
Kantorovich's function and a Lagrange multiplier problem,
demonstrate this aspect of the generalized inverse technique
in nonlinear optimization.


Computing the Inverse

The previous chapter discussed the ST technique for
computing generalized inverses of matrices. Before
demonstrating specific optimization examples, it is best to
detail the workings of the technique. The purpose is to
demonstrate the applicability of the technique to
multiparameter matrices, and to demonstrate the steps in the
algorithm. Later examples leave out much of the
computational detail to conserve space and enhance
readability of this report.

Example 1. (21) Compute the generalized inverses of
the following multiparameter matrix:

\[ A = \begin{bmatrix} x & x^2 + 1 \\ x^2 y & x^3 y + xy \end{bmatrix} \tag{4.1} \]

Augment this matrix with 2 x 2 identity matrices below
and to the right to obtain the initial ST canonical form:

\[ \left[\begin{array}{cc|cc} x & x^2 + 1 & 1 & 0 \\ x^2 y & x^3 y + xy & 0 & 1 \\ \hline 1 & 0 & & \\ 0 & 1 & & \end{array}\right] \tag{4.2} \]

The A matrix portion of equation (4.2) must now be
reduced to an identity matrix. The ST technique requires
the use of elementary transformations to accomplish this
task. As yet r, the rank of the matrix A, is undetermined,
but is computed through the reduction process carried out on
the A matrix. Recall from the discussion last chapter that
any row operations are carried over to the matrix augmented
on the right. In a similar fashion, any column operations
are carried out on the matrix augmented below the matrix A.

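The reduction starts by multiplying the first row by
the polynomial (-xy), and adding the resulting row to the
second row of the matrix. This causes equation (4.2) to
transform to:

\[ \left[\begin{array}{cc|cc} x & x^2 + 1 & 1 & 0 \\ 0 & 0 & -xy & 1 \\ \hline 1 & 0 & & \\ 0 & 1 & & \end{array}\right] \tag{4.3} \]

The next step is to multiply column one by the
polynomial (-x), and add the resulting column to column two.
This operation results in the matrix:

\[ \left[\begin{array}{cc|cc} x & 1 & 1 & 0 \\ 0 & 0 & -xy & 1 \\ \hline 1 & -x & & \\ 0 & 1 & & \end{array}\right] \tag{4.4} \]

The upper left element of (4.4) must equal 1. The
easiest transformation is to simply interchange columns one
and two. Once interchanged, the final transformation, which
completes the matrix reduction, is to multiply the new
column one by the polynomial (-x) and add the result to
column two. These last two operations produce this final
matrix:

\[ \left[\begin{array}{cc|cc} 1 & 0 & 1 & 0 \\ 0 & 0 & -xy & 1 \\ \hline -x & 1 + x^2 & & \\ 1 & -x & & \end{array}\right] \tag{4.5} \]

with $T = \begin{bmatrix} 1 & 0 \end{bmatrix}$, $M = \begin{bmatrix} -xy & 1 \end{bmatrix}$, $S = \begin{bmatrix} -x \\ 1 \end{bmatrix}$, and $N = \begin{bmatrix} 1 + x^2 \\ -x \end{bmatrix}$.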
The final result of the elementary transformations
producing (4.5) is displayed with the S, N, T, and M
submatrices appropriately labeled. Note the rank of the
matrix A is one since $I_r$ is of dimension one. From this
form, the computations of Theorem 3.8 from the previous
chapter produce the following:

\[ A_1 = A_2 = A_{1,2} = ST = \begin{bmatrix} -x \\ 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \end{bmatrix} = \begin{bmatrix} -x & 0 \\ 1 & 0 \end{bmatrix} \tag{4.6} \]

This may be verified by using Penrose conditions (1)
and (2) of Theorem 3.1. To obtain the $A_{1,2,3}$ inverse,
Theorem 3.9 must be used. This theorem requires the
orthogonality of the vectors comprising the T and M matrices
labeled in (4.5). This may be accomplished using a modified
Gram-Schmidt process. This process produces an updated
canonical form matrix, shown here:

\[ \left[\begin{array}{cc|cc} 1 & 0 & \dfrac{1}{x^2 y^2 + 1} & \dfrac{xy}{x^2 y^2 + 1} \\ 0 & 0 & -xy & 1 \\ \hline -x & 1 + x^2 & & \\ 1 & -x & & \end{array}\right] \tag{4.7} \]

From the final ST canonical form provided in (4.7),
$A_{1,2,3}$, the product of the S and T submatrices, is found as
a result of the following computations:

\[ A_{1,2,3} = ST = \begin{bmatrix} -x \\ 1 \end{bmatrix} \begin{bmatrix} \dfrac{1}{x^2 y^2 + 1} & \dfrac{xy}{x^2 y^2 + 1} \end{bmatrix} = \frac{1}{x^2 y^2 + 1} \begin{bmatrix} -x & -x^2 y \\ 1 & xy \end{bmatrix} \tag{4.8} \]

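To obtain the $A_{1,2,4}$ generalized inverse, Theorem 3.10
requires orthogonality between submatrices N and S. Once
again the Gram-Schmidt process is used to produce the matrix
given:

\[ \left[\begin{array}{cc|cc} 1 & 0 & 1 & 0 \\ 0 & 0 & -xy & 1 \\ \hline \dfrac{x}{(x^2+1)^2 + x^2} & \dfrac{x^5 + 3x^3 + 2x}{(x^2+1)^2 + x^2} & & \\ \dfrac{x^2 + 1}{(x^2+1)^2 + x^2} & \dfrac{-x^4 - 2x^2}{(x^2+1)^2 + x^2} & & \end{array}\right] \tag{4.9} \]

From the submatrices in (4.9), the $A_{1,2,4}$ generalized
inverse is found to be:

\[ A_{1,2,4} = ST = \begin{bmatrix} \dfrac{x}{(x^2+1)^2 + x^2} \\ \dfrac{x^2 + 1}{(x^2+1)^2 + x^2} \end{bmatrix} \begin{bmatrix} 1 & 0 \end{bmatrix} = \begin{bmatrix} \dfrac{x}{(x^2+1)^2 + x^2} & 0 \\ \dfrac{x^2 + 1}{(x^2+1)^2 + x^2} & 0 \end{bmatrix} \tag{4.10} \]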

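Finally, the results of making M orthogonal to T and S
orthogonal to N are combined in order to produce the
Moore-Penrose, the $A_{1,2,3,4}$, or simply the $A^+$ generalized
inverse. The final canonical form used is:

\[ \left[\begin{array}{cc|cc} 1 & 0 & \dfrac{1}{x^2 y^2 + 1} & \dfrac{xy}{x^2 y^2 + 1} \\ 0 & 0 & -xy & 1 \\ \hline \dfrac{x}{(x^2+1)^2 + x^2} & \dfrac{x^5 + 3x^3 + 2x}{(x^2+1)^2 + x^2} & & \\ \dfrac{x^2 + 1}{(x^2+1)^2 + x^2} & \dfrac{-x^4 - 2x^2}{(x^2+1)^2 + x^2} & & \end{array}\right] \tag{4.11} \]

The product of the S and T submatrices of this form
will produce the generalized inverse. This $A^+$ inverse is:

\[ A^+ = \frac{1}{(x^2 y^2 + 1)\left[(x^2+1)^2 + x^2\right]} \begin{bmatrix} x & x^2 y \\ x^2 + 1 & xy(x^2 + 1) \end{bmatrix} \tag{4.12} \]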
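
The four inverses of Example 1 can be spot-checked against the Penrose conditions at sample rational values of x and y. The sketch below is only a numerical check at chosen points, not a symbolic proof; the function names are hypothetical, and the arguments are assumed to be `Fraction` values so the arithmetic stays exact.

```python
from fractions import Fraction as F


def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]


def transpose(X):
    return [list(col) for col in zip(*X)]


def A(x, y):                      # matrix (4.1)
    return [[x, x**2 + 1], [x**2 * y, x**3 * y + x * y]]


def inv_12(x, y):                 # (4.6)
    return [[-x, F(0)], [F(1), F(0)]]


def inv_123(x, y):                # (4.8)
    d = x**2 * y**2 + 1
    return [[-x / d, -x**2 * y / d], [1 / d, x * y / d]]


def inv_124(x, y):                # (4.10)
    D = (x**2 + 1)**2 + x**2
    return [[x / D, F(0)], [(x**2 + 1) / D, F(0)]]


def inv_mp(x, y):                 # (4.12)
    d = (x**2 * y**2 + 1) * ((x**2 + 1)**2 + x**2)
    return [[x / d, x**2 * y / d], [(x**2 + 1) / d, x * y * (x**2 + 1) / d]]


def penrose(Am, G):
    """Truth values of Penrose conditions (1)-(4) for a candidate inverse G of Am."""
    AG, GA = matmul(Am, G), matmul(G, Am)
    return (matmul(AG, Am) == Am, matmul(GA, G) == G,
            transpose(AG) == AG, transpose(GA) == GA)


x, y = F(2), F(3)
checks = {name: penrose(A(x, y), g(x, y)) for name, g in
          [("1,2", inv_12), ("1,2,3", inv_123), ("1,2,4", inv_124), ("1,2,3,4", inv_mp)]}
for name, result in checks.items():
    print(name, result)
```

At x = 2, y = 3 each inverse satisfies the conditions named in its subscript, and the first three fail the conditions they were not constructed to meet.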


This then is a detailed example of how the ST method
can be used to sequentially compute representatives of all
the generalized inverses of a matrix as well as the unique
Moore-Penrose generalized inverse. The explicit detail
provided in example 1 is excluded from the remaining
examples. The reason for excluding most of the computational
details from the next nine examples is that the examples come
from various areas of optimization theory and the focus of
this work is on finding the general solution using the
generalized inverse, not on the mechanics of computing the
inverses.


Nonlinear  Unconstrained  Optimization 

Example 2: Rosenbrock's function. (21; 22).

Rosenbrock's function, also referred to as Rosenbrock's
banana-valley function (44:41), was specifically devised as
a challenge to gradient-based optimization methods and has
become a test function for testing computer-based
algorithms. As such it is often used in comparison studies
of optimization techniques (44; 38:120-126). The function
possesses a steep sided valley, nearly parabolic in shape,
and is defined as:

\[ f(x,y) = 100(y - x^2)^2 + (1 - x)^2 \tag{4.13} \]


The maximum or minimum point(s) of this function are
those points for which the partial derivatives of the
function evaluate to zero. Whether the function has
a minimum or a maximum depends upon the value of the Hessian
matrix at a particular stationary point. Expanding the
function (4.13) produces:

\[ f(x,y) = 100y^2 - 200x^2 y + 100x^4 + 1 - 2x + x^2 \tag{4.14} \]

and determining the partial derivatives, $\partial f / \partial x$ and $\partial f / \partial y$,
produces the system of equations:

\[ \begin{cases} 400x^3 - 400xy + 2x = 2 \\ -200x^2 + 200y = 0 \end{cases} \tag{4.15} \]

which may then be written in the matrix form:

\[ A(x,y) \begin{bmatrix} x \\ y \end{bmatrix} = B(x,y) \tag{4.16} \]

where

\[ A(x,y) = \begin{bmatrix} 400x^2 + 2 & -400x \\ -200x & 200 \end{bmatrix} \tag{4.17} \]

and

\[ B(x,y) = \begin{bmatrix} 2 \\ 0 \end{bmatrix} \tag{4.18} \]
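
As a quick consistency check on the matrix form, the residual of (4.16) can be compared with the gradient of (4.13) at a few sample points. The sketch below is illustrative; the names `grad` and `residual` are hypothetical, and rational sample points are used so the comparison is exact.

```python
from fractions import Fraction as F


def grad(x, y):
    """Partial derivatives of f(x, y) = 100 (y - x^2)^2 + (1 - x)^2."""
    return [-400 * x * (y - x * x) - 2 * (1 - x), 200 * (y - x * x)]


def A(x, y):                     # matrix (4.17)
    return [[400 * x * x + 2, -400 * x], [-200 * x, 200]]


B = [2, 0]                       # vector (4.18)


def residual(x, y):
    """A(x, y) [x, y]^T - B, which should equal the gradient of f."""
    a = A(x, y)
    return [a[0][0] * x + a[0][1] * y - B[0], a[1][0] * x + a[1][1] * y - B[1]]


for px, py in [(F(1, 2), F(3)), (F(-2), F(5, 7)), (F(1), F(1))]:
    print(px, py, residual(px, py) == grad(px, py), residual(px, py))
```

The residual agrees with the gradient at every sample point and vanishes at (1, 1), so that point is a stationary point of the function.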


It isn't known whether or not the A matrix is singular,
but using the generalized inverse technique does not
restrict the computations. Forming the initial ST canonical
form as shown in example 1 and performing elementary row and
column operations produces the following:

\[ \left[\begin{array}{cc|cc} 2 & -400x & 1 & 0 \\ 0 & 200 & 0 & 1 \\ \hline 1 & 0 & & \\ x & 1 & & \end{array}\right] \rightarrow \left[\begin{array}{cc|cc} 1 & 0 & 1/2 & 0 \\ 0 & 1 & 0 & 1/200 \\ \hline 1 & 200x & & \\ x & 1 + 200x^2 & & \end{array}\right] \tag{4.19} \]

The operations performed, for the interested reader to
verify, were:

(1) multiply column two by (x) and add to column
one

(2) multiply column one by (200x) and add to
column two

(3) interchange columns one and two, divide row
one by the scalar 2, and divide row two by
200.




in analytical geometry and optimization. The study of this
local behavior is often conducted as a result of first
finding the variable y as a function of x, or say x as a
function of y. Through the use of an implicit function, a
complex functional system of the form:

\[ \begin{cases} F(x_1, x_2, x_3, x_4) = 0 \\ G(x_1, x_2, x_3, x_4) = 0 \end{cases} \tag{4.23} \]

can be simplified into the form of:

\[ \begin{cases} F(x_1, x_2, f(x_1, x_2), g(x_1, x_2)) = 0 \\ G(x_1, x_2, f(x_1, x_2), g(x_1, x_2)) = 0 \end{cases} \tag{4.24} \]

where $x_3 = f(x_1, x_2)$ and $x_4 = g(x_1, x_2)$ are the implicit
functions obtained from (4.23) (33:479-510).

In optimization, implicit function theory can be used
to eliminate variables from systems, thereby reducing the
dimension of the problem. Implicit function theory is also
found in implicit differentiation. In this technique, an
implicit function of one variable is found in terms of its
partial derivative. Once found, the resulting system of
equations can be solved simultaneously for the maximum or
minimum of the system (3:172-173).

Example 3 (33:141-149). Suppose the common solution to
the following system of equations is sought:

\[ \begin{cases} F(x,y,u,v) = x^2 + 2xy - 3xu + 4yv = 0 \\ G(x,y,u,v) = 4xy + x^2 u - 8yv + 2 = 0 \end{cases} \tag{4.25} \]


The above system may be written in the now familiar
matrix form of Ax = b as the following:

\[ \begin{bmatrix} x & 2x & -3x & 4y \\ 0 & 4x & x^2 & -8y \end{bmatrix} \begin{bmatrix} x \\ y \\ u \\ v \end{bmatrix} = \begin{bmatrix} 0 \\ -2 \end{bmatrix} \tag{4.26} \]

where this particular representation for the A matrix is not
necessarily unique.

The transformations used to reduce the initial ST canonical form
to the final ST canonical form were:

(1) multiply column 1 by (-2) and (3) and add to
columns 2 and 3 respectively
(2) multiply column 2 by (-x/4) and add to column 3
(3) multiply row 1 by (2) and add to row 3
(4) multiply column 2 by (-1/2) and add to column 1
(5) divide row 1 by (x) and row 2 by (4x)
(6) multiply column 1 by (-4y/x) and add to column 4

In this case the $A_{1,2}$ is obtained by the product of the
S and T submatrices of the final ST canonical form. The
conditions (ST)A(ST) = (ST) and A(ST)A = A may be verified as
holding true. In this case the general solution of the
equation (4.26) is given by the following form:


\[ x = A_{1,2}\, b + N z \tag{4.28} \]

where z is an arbitrary vector. This general form generates
all solutions to equation (4.26) by appropriate choice of
values for z. This solution is carried out in the following
equations:

\[ x = \begin{bmatrix} 1/x & -1/(2x) \\ 0 & 1/(4x) \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 \\ -2 \end{bmatrix} + \begin{bmatrix} 3 + x/2 & -8y/x \\ -x/4 & 2y/x \\ 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} \tag{4.29} \]

for arbitrary $z_1$ and $z_2$, elements of z, of (4.28) above. So
the final general solution is:

\[ x = \begin{bmatrix} 1/x + (3 + x/2) z_1 - (8y/x) z_2 \\ -1/(2x) + (-x/4) z_1 + (2y/x) z_2 \\ 0 + z_1 \\ 0 + z_2 \end{bmatrix} \tag{4.30} \]
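
The general solution (4.30) can be checked by substituting it back into (4.26) for sample parameter values. The sketch below treats x and y in the coefficient matrix as given parameters; the names `A` and `solution` are hypothetical, and `Fraction` inputs are assumed so the arithmetic is exact.

```python
from fractions import Fraction as F


def A(x, y):                          # coefficient matrix of (4.26)
    return [[x, 2 * x, -3 * x, 4 * y], [F(0), 4 * x, x * x, -8 * y]]


b = [F(0), F(-2)]                     # right-hand side of (4.26)


def solution(x, y, z1, z2):
    """General solution (4.30) for parameter values x != 0, y and arbitrary z1, z2."""
    return [1 / x + (3 + x / 2) * z1 - (8 * y / x) * z2,
            -1 / (2 * x) - (x / 4) * z1 + (2 * y / x) * z2,
            z1,
            z2]


for x, y, z1, z2 in [(F(1), F(-1, 2), F(0), F(0)), (F(2), F(3), F(1, 3), F(-5))]:
    s = solution(x, y, z1, z2)
    lhs = [sum(a * v for a, v in zip(row, s)) for row in A(x, y)]
    print(s, lhs == b)
```

With x = 1, y = -1/2 and z = (0, 0), the sketch returns the vector (1, -1/2, 0, 0).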


Clearly, x cannot assume a zero value as the solutions
are undefined for that value. Since z is an arbitrary
vector, selecting z = (0,0) yields a specific solution to the
system, namely, x = 1, y = -1/2, u = 0, and v = 0.

In more classical implicit function settings,
F(x,y,u,v) would have been solved for u, and the resulting
expression substituted into G(x,y,u,v). The resulting
equation would be solved for v to obtain v = g(x,y) as
discussed above. This implicit function would then be used
to obtain u = f(x,y), again as discussed above. For this
particular problem, this process yields (33:490):

\[ \begin{cases} u = f(x,y) = \dfrac{(x^2 + 2xy)(6y - xy) + (x^3 + 2x^2 y + 12xy + 6)\,y}{3x(6y - xy)} \\[2ex] v = g(x,y) = \dfrac{x^3 + 2x^2 y + 12xy + 6}{24y - 4xy} \end{cases} \tag{4.31} \]

These solutions from implicit function theory agree
with the results obtained using the generalized inverse of
the multiparameter system of equations. Choose x = 1 and y =
-1/2 and equation (4.31) yields values of u = v = 0.

Example 4. Given the system of homogeneous equations:

\[ \begin{cases} x + y + z = 0 \\ x^2 + y^2 + z^2 + 2xz - 1 = 0 \end{cases} \tag{4.32} \]

show whether x and y can be considered as functions of the
variable z. One particular method of solving this problem is
to use Jacobian determinants. As an alternative approach,
consider the use of the generalized inverse using a
formulation of the form Ax = b as in the following:

\[ \begin{bmatrix} 1 & 1 & 1 \\ x + 2z & y & z \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \tag{4.33} \]

From equation (4.33), the A matrix can be placed into
the initial ST canonical form. This initial form can then be
reduced, using only a series of elementary transformations,
to the following final ST canonical form:

\[ \left[\begin{array}{ccc|cc} 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & \dfrac{-(x + 2z)}{y - x - 2z} & \dfrac{1}{y - x - 2z} \\ \hline 1 & -1 & \dfrac{z - y}{y - x - 2z} & & \\ 0 & 1 & \dfrac{x + z}{y - x - 2z} & & \\ 0 & 0 & 1 & & \end{array}\right] \tag{4.34} \]

From  this  final  form,  the  product  of  the  S  and  T 
submatrices  produce  the  A  generalized  inverse.  This 
inverse: 


ST  =  A 


y/Cy-x-Sz3 
-C  x+az3  /C  y-x-az3 


-1/Cy-x-az 
1  /'C  y-x-az5 


C4.  351) 


satisfies the consistency condition for the existence of a
solution to (4.33), namely $A A_{1,2}\,\underline{b} = \underline{b}$, and can therefore be
used to obtain the general solution as given by the
equation $\underline{x} = A_{1,2}\,\underline{b} + [I - A_{1,2}A]\,\underline{w}$, where $\underline{w}$ is an arbitrary
vector. The final form of the general solution, after
computing the products and simplifying the expressions, is:

\[
\begin{bmatrix} x \\ y \\ z \end{bmatrix}
=
\begin{bmatrix}
(-1 - y w_3 + z w_3)/(y-x-2z) \\
(1 + x w_3 + z w_3)/(y-x-2z) \\
w_3
\end{bmatrix}
\tag{4.36}
\]

Now equation (4.36) is the general form of the solution
to equation (4.33). This implies that all solutions of
(4.33) can be generated by (4.36) through appropriate
choice of $\underline{w}$, in particular just the $w_3$ component of $\underline{w}$. To
find a solution to this problem, simply choose $w_3 = z$ and the
general solution becomes:

\[
\begin{cases}
x = (-1 - yz + z^2)/(y - x - 2z) \\
y = (1 + xz + z^2)/(y - x - 2z)
\end{cases}
\tag{4.37}
\]

with z free to take on all values except zero. Thus, x and y
can be expressed as functions of z.
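
The Penrose and consistency conditions used in this example can be spot-checked numerically. The following is only a sketch under stated assumptions: it takes the coefficient matrix to be A = [[1, 1, 1], [x + 2z, y, z]] acting on (x, y, z) with b = (0, 1), freezes the parameters at arbitrary rational values, and uses illustrative helper names (`matmul`, `example4`, `general_solution`) that do not come from the thesis.

```python
from fractions import Fraction as F

def matmul(P, Q):
    """Plain matrix product for matrices stored as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def example4(x, y, z):
    """Return (A, G): the assumed coefficient matrix and a {1,2}-inverse of it."""
    d = y - x - 2 * z                      # common denominator, must be non-zero
    A = [[F(1), F(1), F(1)], [x + 2 * z, y, z]]
    G = [[y / d, -1 / d], [-(x + 2 * z) / d, 1 / d], [F(0), F(0)]]
    return A, G

def general_solution(A, G, b, w):
    """Evaluate G b + (I - G A) w for flat lists b and w."""
    n = len(G)
    GA = matmul(G, A)
    Gb = [sum(G[i][j] * b[j] for j in range(len(b))) for i in range(n)]
    return [Gb[i] + w[i] - sum(GA[i][j] * w[j] for j in range(n))
            for i in range(n)]

A, G = example4(F(1), F(5), F(1))          # arbitrary sample parameter values
b = [F(0), F(1)]
sol = general_solution(A, G, b, [F(7), F(-3), F(2)])
# Residual of A * sol - b with the parameters held fixed in A.
residual = [sum(A[i][j] * sol[j] for j in range(3)) - b[i] for i in range(2)]
```

With the parameters frozen, any choice of the arbitrary vector leaves the residual at zero, which is what the consistency condition guarantees.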

Nonlinear  Constrained  Optimization 

In the unconstrained section, use was made of the
generalized inverse to solve the system of equations arising
when the partial derivatives of the function were set equal
to zero (i.e. to find the stationary points). In this
section, the problem is that of constrained optimization.
The type of problems addressed involve objective functions
of higher order than quadratic, and concave constraints. The
particular technique used is a generalization of the
quadratic programming technique of Nelson discussed in
Chapter II.

Example 5. Consider the following problem (38:320):

\[
\begin{aligned}
\max \quad & x_1^4 + x_2 \\
\text{s.t.} \quad & 2x_1^2 + 3x_2 \le 9 \\
& x_1,\; x_2 \ge 0
\end{aligned}
\tag{4.38}
\]

Since the unconstrained maximum of the objective
function is unbounded (i.e. infinite), the constraint in the
problem must be a binding constraint. The constraint can
therefore be rewritten as an equality constraint, and in the
Ax = b format as:

\[
\begin{bmatrix} 2x_1 & 3 \end{bmatrix}
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} 9 \end{bmatrix}
\tag{4.39}
\]

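For the A matrix in (4.39), a representative of the
$A_{1,2}$ class of generalized inverses can be obtained after
producing and reducing the ST canonical form in the
following manner:

\[
\left[\begin{array}{cc|c}
2x_1 & 3 & 1 \\ \hline
1 & 0 & \\
0 & 1 &
\end{array}\right]
\;\longrightarrow\;
\left[\begin{array}{cc|c}
1 & 0 & \dfrac{1}{2x_1} \\ \hline
1 & -\dfrac{3}{2x_1} & \\
0 & 1 &
\end{array}\right]
\tag{4.40}
\]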


from which $(A_{1,2})^T = (ST)^T = (1/(2x_1),\ 0)$. The solution to
the problem, in the general form, is found using the
equation $x = A_{1,2}\,b + (I - A_{1,2}A)z$, where z is arbitrary.
This solution is:

\[
\begin{bmatrix} x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix} (9 - 3z_2)/(2x_1) \\ z_2 \end{bmatrix}
\tag{4.41}
\]


where $z_2 \in z$. A particular solution, obtained by choosing
$z_2 = 0$, is $x = (9/(2x_1),\ 0)$. Using this particular solution
in the constraint gives the boundary point as $x = (\sqrt{4.5},\ 0)$,
and an objective function value of f(x) = 20.25 (38:320).

The question that must be addressed is why choose $z_2$ as
zero. Since $z_2$, and hence $x_2$, is free to vary it is sensible
to select the value to get the most gain in the objective
function. The $x_1$ variable is raised to the fourth power,
while $x_2$ is only linear. The most gain per unit increase
will come from $x_1$ as compared to $x_2$. It is therefore
advantageous to allow $x_1$ as large a value as possible. Due
to the constraints, this requires $x_2$ to go as low as
possible, or zero. Thus, the insight gained from the
generalized inverse solution is augmented by knowledge of
the problem and combined with an understanding of the
fundamental subspaces of a matrix to determine the optimal
solution to the problem.
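
The argument for setting the free component to zero can be illustrated by evaluating the objective along the binding constraint. This is a small sketch, assuming the problem reads max x1^4 + x2 subject to 2 x1^2 + 3 x2 = 9 with both variables non-negative; the function name `objective_on_boundary` and the sampling grid are illustrative choices only.

```python
from fractions import Fraction as F

def objective_on_boundary(x2):
    """Objective x1**4 + x2 with x1**2 eliminated through 2*x1**2 + 3*x2 = 9."""
    x2 = F(x2)
    x1_squared = (9 - 3 * x2) / 2
    if x2 < 0 or x1_squared < 0:
        raise ValueError("point violates the non-negativity constraints")
    return x1_squared ** 2 + x2

# Sample the feasible part of the boundary, 0 <= x2 <= 3, in steps of 1/4.
samples = [F(k, 4) for k in range(13)]
values = {t: objective_on_boundary(t) for t in samples}
best = max(values, key=values.get)
```

On this grid the largest objective value occurs at x2 = 0, in line with the reasoning that the quartic term dominates the linear one.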

Example 6. The previous example only had a single
constraint, and may therefore have seemed trivial. However,
it served to demonstrate the technique. This next example
provides a more detailed insight. Consider the following
problem (38:333-335):


\[
\begin{aligned}
\min \quad & -x_1 - x_2 \\
\text{s.t.} \quad & 2x_1 - x_2^2 \ge 1 \\
& -.8x_1^2 - 2x_2 \ge -9 \\
& x_1,\; x_2 \ge 0
\end{aligned}
\tag{4.42}
\]

In the first stage, each constraint is individually
treated as an equality constraint. The second stage will
consider both constraints simultaneously as equality
constraints. Had the problem been large (i.e. more
constraints), the combinatorial considerations of the
subsets of equality constraints in each stage would have
been significantly more involved.
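
The staging described here amounts to enumerating subsets of the constraints by size. A minimal sketch of that bookkeeping, using a hypothetical helper `stages` that is not part of the thesis, shows how quickly the number of cases grows:

```python
from itertools import combinations

def stages(num_constraints):
    """Stage k lists every k-subset of constraint indices treated as equalities."""
    indices = range(num_constraints)
    return [list(combinations(indices, k))
            for k in range(1, num_constraints + 1)]

two = stages(2)   # two single-constraint cases, then the pair
total_for_ten = sum(len(stage) for stage in stages(10))
```

For m constraints there are 2^m - 1 non-empty subsets in total, which is the source of the combinatorial cost mentioned above.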

The first step, of the first stage, is to consider the
constraint $2x_1 - x_2^2 = 1$, ignoring the second constraint for
the moment. Placing the constraint into the ST format and
reducing the augmented form to the final ST canonical form
produces the following:

\[
\left[\begin{array}{cc|c}
2 & -x_2 & 1 \\ \hline
1 & 0 & \\
0 & 1 &
\end{array}\right]
\;\longrightarrow\;
\left[\begin{array}{cc|c}
1 & 0 & \dfrac{1}{2} \\ \hline
1 & \dfrac{x_2}{2} & \\
0 & 1 &
\end{array}\right]
\tag{4.43}
\]


Using $(A_{1,2})^T = S^T T^T = (1,\ 0)(1/2) = (1/2,\ 0)$ and the
same formula for the general solution as was used in Example
5 yields the following expressions:

\[
x =
\begin{bmatrix} 1/2 + x_2 z_2/2 \\ z_2 \end{bmatrix}
=
\begin{bmatrix} (z_2^2 + 1)/2 \\ z_2 \end{bmatrix}
\tag{4.44}
\]


When this expression is evaluated in the objective
function, the resulting function, $f(x) = (z_2 + 1)^2/2$, is
minimized when the $z_2$ variable takes on a value of -1.
Selecting $z_2 = -1$ produces the particular solution of
x = (0, -1) and f(x) = 1. However, this solution violates the
non-negativity constraint for the problem and is thus an
infeasible solution.

In a similar manner, the second constraint is now
considered as an equality constraint, ($-.8x_1^2 - 2x_2 = -9$),
ignoring for the moment the first constraint. This
particular constraint produces an $(A_{1,2})^T = (-1.25/x_1,\ 0)$ which
is used in the formula for the general solution to produce
the following expression:

\[
x =
\begin{bmatrix} (90 - 20z_2)/(8x_1) \\ z_2 \end{bmatrix}
\tag{4.45}
\]

This form, when used in the objective function,
produces a potential solution of x = (0, 4.5) yielding an
objective function value of f(x) = -4.5. However, this
solution is not feasible for the first constraint.

This  completes  consideration  of  single  constraints.  In 
the  second  stage,  constraints  are  considered  as  equality 
constraints  in  pairs  of  equations.  Since  this  results  in 
considering  all  the  constraints,  this  is  the  last  step  for 
this  particular  problem.  The  solution  obtained  will  be 
either  feasible  and  optimal  or  it  will  be  infeasible.  If 
infeasible,  the  problem  is  inconsistent  and  the  solution 
obtained will be the best in a least-squares sense.

Considering both constraints yields the following
transformation to the final ST canonical form:


(4.46)


Since the matrix involving both constraints has full
rank, the product of the S and T submatrices not only
produces the $A_{1,2}$ inverse but the $A^+$ and $A^{-1}$ as well, since
all are the same matrix. This means the general solution
given by $x = A^+ b + (I - A^+ A)z$ reduces to simply $x = A^+ b$ and
is given by:

\[
x =
\begin{bmatrix}
\dfrac{5(2 + 9x_2)}{4(5 + x_1 x_2)} \\[2ex]
\dfrac{45 - 2x_1}{2(5 + x_1 x_2)}
\end{bmatrix}
\tag{4.47}
\]

This is the expression for all solutions to the pair of
equality constraints. Using these expressions in the
objective function yields the functional form of:

\[
f(x_1, x_2) = \frac{45x_2 + 4x_1 - 80}{4(5 + x_1 x_2)}
\tag{4.48}
\]


The minimum points of (4.48) can be found by computing
the two partial derivatives, $\partial f/\partial x_1$ and $\partial f/\partial x_2$, setting the
equations equal to zero, and solving the resulting set of
equations. The partial derivatives are:

\[
\begin{aligned}
\frac{\partial f}{\partial x_1} &= \frac{-20(9x_2 - 2)(x_2 - 2)}{(20 + 4x_1 x_2)^2} \\[1ex]
\frac{\partial f}{\partial x_2} &= \frac{-4(2x_1 + 5)(2x_1 - 45)}{(20 + 4x_1 x_2)^2}
\end{aligned}
\tag{4.49}
\]

giving possible, feasible, solutions of (22.5, 6.6),
(.525, .222), and (2.5, 2) of which f(2.5, 2) = -4.5 is the
best solution and ultimately the optimal solution of the
problem (38:335).
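
As a consistency check on the reported point (2.5, 2), it can be substituted into both constraints and into the closed-form expressions of (4.47). The sketch below assumes the constraints read 2 x1 - x2^2 >= 1 and -0.8 x1^2 - 2 x2 >= -9; the function names are illustrative and not taken from the thesis.

```python
from fractions import Fraction as F

def constraint_values(x1, x2):
    """Left-hand sides of the two constraints, in the order stated."""
    return 2 * x1 - x2 ** 2, -F(4, 5) * x1 ** 2 - 2 * x2

def eq_4_47(x1, x2):
    """Right-hand side of (4.47) with the parameters frozen at (x1, x2)."""
    denom = 5 + x1 * x2
    return 5 * (2 + 9 * x2) / (4 * denom), (45 - 2 * x1) / (2 * denom)

point = (F(5, 2), F(2))
lhs = constraint_values(*point)   # both constraints hold with equality here
image = eq_4_47(*point)           # the point reproduces itself under (4.47)
```

Because the coefficient matrix depends on the unknowns, a solution of the pair of equalities shows up as a fixed point of the closed-form expressions, which is what `image == point` expresses.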

Control  theory 

Example 7. In a 1988 article, Sebek (39) considered a
robust control theory problem of the form:

\[ Ax + By = I \tag{4.50} \]

where A and B are matrices defined over polynomials of n
unknowns and I is a vector of constants. The example used is
the following:

\[ (1 - vw)x + v^2 y = 1 \tag{4.51} \]

and solutions to x and y are sought. The $A_{1,2}$ generalized
inverse can be applied to this problem, after first
expressing (4.51) in the Ax = b format of:

and  using  the  A  matrix  in  the  ST  canonical  form  to  obtain 


The $A_{1,2}$ generalized inverse, found by computing the
product of S and T, is $(1,\ w/v)^T$. Since the consistency
condition, $A A_{1,2}\,b = b$ holds, the general solution is given
by $x = A_{1,2}\,b + (I - A_{1,2}A)z$, for arbitrary z. This general
form works out to the following:

\[
\begin{bmatrix} x \\ y \end{bmatrix}
=
\begin{bmatrix}
1 + z_1(vw) - z_2(v^2) \\
w/v + z_1(w^2 - w/v) + z_2(1 - wv)
\end{bmatrix}
\tag{4.54}
\]


Now the expression in (4.54) is an expression for the
general solution, meaning that all particular solutions to
the problem may be generated by choice of z. These
particular solutions differ only in the homogeneous portion
of the solution. For instance letting $z^T = (1,\ 0)$ gives rise
to the particular solution of $x^T = (1 + vw,\ w^2)$ which is the
only solution obtained by Sebek. The advantage gained by the
generalized inverse technique is the more powerful general
form of the solution obtained.
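
The general solution can be checked by direct substitution into (1 - vw)x + v^2 y = 1. The following sketch does this with exact rational arithmetic for a grid of free values; the function names and the sample parameter values are assumptions made for illustration.

```python
from fractions import Fraction as F

def general_solution(v, w, z1, z2):
    """Components (x, y) of the general solution for free values z1, z2."""
    x = 1 + z1 * (v * w) - z2 * v ** 2
    y = w / v + z1 * (w ** 2 - w / v) + z2 * (1 - w * v)
    return x, y

def residual(v, w, x, y):
    """Left side minus right side of (1 - v*w)*x + v**2*y = 1."""
    return (1 - v * w) * x + v ** 2 * y - 1

v, w = F(3), F(-2, 7)   # arbitrary sample parameters, v must be non-zero
checks = [residual(v, w, *general_solution(v, w, F(a), F(b)))
          for a in range(-2, 3) for b in range(-2, 3)]
single = general_solution(v, w, F(1), F(0))   # the choice z = (1, 0)
```

Every entry of `checks` vanishes, and the choice z = (1, 0) collapses to the single particular solution (1 + vw, w^2) quoted in the text.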


Counter examples

The  examples  presented  thus  far  have  shown  how  the 
generalized  inverse  technique  can  lead  to  expressions  from 
which  to  determine  optimal  solutions.  However,  this  is  not 
always  the  case.  In  the  next  two  examples  an  explicit 
solution  to  the  problem  is  not  found,  but  at  the  same  time 
these  two  examples  show  that  the  general  form  of  the 
solution  can  still  provide  useful  information. 

Example 8. Another function specifically designed to
test the gradient based optimization techniques is a test
function due to Kantorovich (44:42-43). This function:


The A matrix is placed into the ST canonical form and
reduced using a series of elementary transformations to the
final ST canonical form. This produces the following
transition, from initial ST canonical form:


Just as in the earlier example on Rosenbrock's function, A is a
full rank matrix meaning $A^+ = A^{-1}$ and the solution to the
problem is given by $x = A^+ b$, which produces the expression:

1  -  xy  ..  C  4 . 605 

a  z  z 
X  -3x  y 

x"  -  3xy 
yCx"-3x*y*5 

The expression for x is quite complicated. It shows the
dependent relationship between x and y and just how delicate
the process of finding the optimal solution can be. Although
the expression for the general solution obtained using the
generalized inverse proves that a unique solution to the
problem does in fact exist, the problem is that the above
expression for x does not help find that optimal solution.

In his work on gradient based optimization techniques, Stein
found the best solution to date for the function (4.55) as
X  *  C.  99278  .  .306443  and  fCx3  =  .28173  x  10"“  C 44;  433. 

Example 9. This example is a particular nonlinear,
constrained optimization problem for which the method of
Lagrange multipliers fails (48:65):

\[
\begin{aligned}
\min \quad & (x^2 + y^2)^{1/2} \\
\text{s.t.} \quad & y^2 - (x - 1)^3 = 0
\end{aligned}
\tag{4.61}
\]

The generalized inverse technique is generally
applicable to Lagrange multiplier optimization (see Appendix
A for definition). The Lagrangian function, when
differentiated, produces a system of partial derivatives,
which when set equal to zero, provide the optimal solutions
to the problem.

If there are say m variables, or unknowns, in the
problem, and n constraints, the system of partial
derivatives produces m+n equations and m+n unknowns.
Generally such systems produce a unique solution. The
generalized inverse of the (m+n) x (m+n) matrix is used to
determine the optimal values of the undetermined multipliers
and the variables of the problem.

For this problem, the Lagrangian function is:

\[
L(x, y, \lambda) = (x^2 + y^2)^{1/2} + \lambda\,(y^2 - x^3 + 3x^2 - 3x + 1)
\tag{4.62}
\]

Taking partial derivatives, $\partial L/\partial x$, $\partial L/\partial y$, and $\partial L/\partial \lambda$,
and rewriting the resulting system of equations into the
Ax = b form produces:


l+6y*X-3xX  -3x*yX-3y\  6x*-3x* 

X 

*  • 

0 

2xyX  1  2y* 

y 

0 

-x*+3x-3  y  O 

.  -1. 

The A matrix is full rank so again $A^+ = A^{-1}$. The
MACSYMA expert system contains an inverse function, called
INVERT(matrix), so MACSYMA was used to obtain the $A^{-1}$ matrix
for (4.63) (50:0-55). Since MACSYMA does not compute any
generalized inverses, had A been singular, the ST method
would have been used to compute the $A^+$. The $A^{-1}$ matrix
provides the solution to the problem by the formula
$x = A^{-1} b$. The final expression is then:


3CC2x*+  Z:>y*><  -  X*  *  2x*5 
~  “  ZyCCGy*  -  3xy*  +  3x®  -  6x*5\  +  y*5 

-CCOx*  +  Ox5y*\*  +  COy*  -  3x5\  +  15 

and 


C4.  645 


C4.  655 

W  =  _ 1 _ 

C  C 1 2y**+C  -6x*+l  8x*-24x*+l  2x-85  y‘*+C  6x®-l  2x*5  y*5\+ 

1 2y*  +3x‘*-l  5x®  +27x*  -1 8x®  5 

Just as in Example 8, this final expression does not
yield an optimal solution. However, using the generalized
inverse to obtain the expression for the general solution
demonstrates the non-existence of a finite Lagrange
multiplier for the problem.
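
One way to see why no finite multiplier exists is to compare gradients at the constrained minimizer. The sketch below is an independent illustration and not the MACSYMA computation described above: it assumes the objective (x^2 + y^2)^(1/2) and the constraint y^2 - (x - 1)^3 = 0, and the function names are hypothetical.

```python
from math import hypot

def grad_objective(x, y):
    """Gradient of f(x, y) = sqrt(x**2 + y**2), valid away from the origin."""
    r = hypot(x, y)
    return (x / r, y / r)

def grad_constraint(x, y):
    """Gradient of g(x, y) = y**2 - (x - 1)**3."""
    return (-3 * (x - 1) ** 2, 2 * y)

# The constraint forces (x - 1)**3 = y**2 >= 0, hence x >= 1, so the feasible
# point closest to the origin is the cusp (1, 0).
gf = grad_objective(1.0, 0.0)
gg = grad_constraint(1.0, 0.0)
```

At (1, 0) the constraint gradient vanishes while the objective gradient does not, so the stationarity condition grad f + lambda * grad g = 0 cannot be met by any finite lambda.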

Common  Solutions 

The last theorems in Chapter III were presented to
demonstrate the trend towards multiparameter matrices.
Another purpose was to lay the theoretical groundwork for
future applications of parallel processing capabilities. The
next, and final, example demonstrates the applicability of
this theory.

Example 10. Consider the following system of the form
of Ax = b (21):


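\[
\begin{bmatrix} x & x^2 + 1 \\ x^2 y & x^3 y + xy \end{bmatrix}
\underline{x}
=
\begin{bmatrix} 2x + x^3 \\ 2x^2 y + x^4 y \end{bmatrix}
\tag{4.66}
\]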


In  this  particular  example,  first  the  solution  to  the 
equation  is  found  and  then  the  equation  is  decomposed  into  a 
system  of  n=2  matrix  equations.  This  system  of  matrix 
equations  is  then  solved  individually  and  also  using  the 
results  of  Theorem  3.13. 

First the $A_{1,2}$ generalized inverse of A is computed.
The transformation from the initial ST canonical form to the
final ST canonical form is given by:


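From this the $A_{1,2}$ generalized inverse is found to be:

\[
A_{1,2} = \begin{bmatrix} -x & 0 \\ 1 & 0 \end{bmatrix}
\tag{4.68}
\]

which when used in the general solution formula used in
other examples, $\underline{x} = A_{1,2}\,\underline{b} + (I - A_{1,2}A)\,\underline{z}$, yields:

\[
\begin{aligned}
\underline{x} &=
\begin{bmatrix} -x & 0 \\ 1 & 0 \end{bmatrix}
\begin{bmatrix} 2x + x^3 \\ 2x^2 y + x^4 y \end{bmatrix}
+ \left( I -
\begin{bmatrix} -x & 0 \\ 1 & 0 \end{bmatrix}
\begin{bmatrix} x & x^2 + 1 \\ x^2 y & x^3 y + xy \end{bmatrix}
\right) \underline{z} \\[1ex]
&=
\begin{bmatrix} -2x^2 - x^4 \\ 2x + x^3 \end{bmatrix}
+
\begin{bmatrix} 1 + x^2 & x^3 + x \\ -x & -x^2 \end{bmatrix}
\begin{bmatrix} z_1 \\ z_2 \end{bmatrix}
\end{aligned}
\tag{4.69}
\]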

Since $\underline{z}^T = (z_1,\ z_2)$ is an arbitrary vector, choosing a
value of $\underline{z}^T = (1,\ x)$ will give a particular solution to the
problem. When used in (4.69), this solution is $\underline{x}^T = (1,\ x)$.

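Next, reconsider (4.66) as a system of matrix equations
of the form:

\[
\begin{cases}
A_1 \underline{x} = B_1 \\
A_2 \underline{x} = B_2
\end{cases}
\tag{4.70}
\]

where

\[
\begin{cases}
A_1 = (x \quad x^2 + 1) & B_1 = (2x + x^3) \\
A_2 = (x^2 y \quad x^3 y + xy) & B_2 = (2x^2 + x^4 y)
\end{cases}
\tag{4.71}
\]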

The intent is to find a common solution to the set of
matrix equations. In a manner similar to that of (4.67),
each of $A_1$ and $A_2$ can be reduced, within the ST canonical
form, to find an $A_{1,2}$ generalized inverse of the matrix. To
avoid confusion of notation with the subscripts used in
(4.70) and (4.71), these $A_{1,2}$ generalized inverses are
denoted as $A_1^-$ and $A_2^-$ respectively through the remainder of
this example.

From the ST computations, representatives of the $A_1^-$ and
$A_2^-$ inverses are found. These inverses are:

\[
A_1^- = \begin{bmatrix} -x \\ 1 \end{bmatrix}
\qquad
A_2^- = \begin{bmatrix} 1/(x^2 y) \\ 0 \end{bmatrix}
\tag{4.72}
\]
A check of Penrose conditions (1) and (2) will verify
that these are in fact $A_{1,2}$ inverses of $A_1$ and $A_2$
respectively.

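Theorem 3.13 requires that each of the conditions
$A_1 A_1^- B_1 = B_1$ and $A_2 A_2^- B_2 = B_2$ hold for each of the equations of
(4.70) to have a solution. To demonstrate $A_1 A_1^- B_1 = B_1$:

\[
A_1 A_1^- B_1
= \begin{bmatrix} x & x^2 + 1 \end{bmatrix}
\begin{bmatrix} -x \\ 1 \end{bmatrix}
\begin{bmatrix} 2x + x^3 \end{bmatrix}
= \begin{bmatrix} 1 \end{bmatrix}
\begin{bmatrix} 2x + x^3 \end{bmatrix}
= B_1
\tag{4.73}
\]

and to demonstrate $A_2 A_2^- B_2 = B_2$:

\[
A_2 A_2^- B_2
= \begin{bmatrix} x^2 y & x^3 y + xy \end{bmatrix}
\begin{bmatrix} 1/(x^2 y) \\ 0 \end{bmatrix}
\begin{bmatrix} 2x^2 + x^4 y \end{bmatrix}
= \begin{bmatrix} 1 \end{bmatrix}
\begin{bmatrix} 2x^2 + x^4 y \end{bmatrix}
= B_2
\tag{4.74}
\]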

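Since (4.73) is true, the general solution to $A_1 \underline{x} = B_1$
is given by $\underline{x} = A_1^- B_1 + (I - A_1^- A_1)\,\underline{z}$ for arbitrary $\underline{z}$. This is
expressed as:

\[
\underline{x} =
\begin{bmatrix} -2x^2 - x^4 \\ 2x + x^3 \end{bmatrix}
+
\begin{bmatrix} 1 + x^2 & x^3 + x \\ -x & -x^2 \end{bmatrix}
\begin{bmatrix} z_1 \\ z_2 \end{bmatrix}
\]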

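Letting $\underline{z}^T = (1,\ x)$ in the above gives the particular
solution to the problem, $\underline{x}^T = (1,\ x)$. In a similar manner,
the condition verified in equation (4.74) means the general
solution to $A_2 \underline{x} = B_2$ is given by $\underline{x} = A_2^- B_2 + (I - A_2^- A_2)\,\underline{z}$ for $\underline{z}$ an
arbitrary vector. This expression turns out, after
simplification, to be:

\[
\underline{x} =
\begin{bmatrix} (2 + x^2 y)/y \\ 0 \end{bmatrix}
+
\begin{bmatrix} 0 & (-1 - x^2)/x \\ 0 & 1 \end{bmatrix}
\begin{bmatrix} z_1 \\ z_2 \end{bmatrix}
\tag{4.75}
\]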
Letting $\underline{z}^T = (1,\ x)$ as before yields a particular
solution $\underline{x}^T = ((2 - y)/y,\ x)$. Though this appears somewhat
different from the previous result in equation (4.75), it
really is not. If $\underline{x}^T = (1,\ x)$, then y = 1. If y = 1 for this
current particular solution, then $\underline{x}^T = (1,\ x)$. So actually
this particular solution is equivalent to the previous
particular solution.

Thus far this example has found a solution to (4.66)
and (4.70). This demonstrates that a common solution to an
equation must satisfy the individual elements of that
equation.

The next step is to actually demonstrate the validity
of the recursive formulas given in Theorem 3.13. Using
(4.70), Theorem 3.13 gives the following expression for the
common solution to the set of matrix equations:

\[ \underline{x} = E_2 + F_2\,\underline{z} \tag{4.77} \]

where  z  is  an  arbitrary  vector.  Using  the  recursive 
relationships  of  Theorem  3.13,  namely: 


results in the following expanded form of (4.77). This form
is the formula that is ultimately evaluated to provide the
common solution to the set of equations:


(4.79)


X .  [a;b,  .  (i  -  * 

[[i-V.)-[i-a;a)(a,-v;a]  (a,-a.v.]][  ] 

These computations finally produce the following
expression for the general solution to the system of
equations in (4.70):

(4.80)

and letting $\underline{z}^T = (1,\ x)$ yields $\underline{x}^T = (1,\ x)$, the common
solution to the system of equations (4.70) and the solution
to the problem as originally stated (4.66).
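
The claim that (1, x) solves the original system can be verified by substitution. The sketch below is an illustration under assumptions: it takes the coefficient matrix of (4.66) to be [[x, x^2 + 1], [x^2 y, x^3 y + xy]] with right-hand side (2x + x^3, 2x^2 y + x^4 y), uses the general solution in the form given in (4.69), and introduces helper names that are not from the thesis.

```python
from fractions import Fraction as F

def system(x, y):
    """Assumed coefficient matrix and right-hand side of (4.66)."""
    A = [[x, x ** 2 + 1], [x ** 2 * y, x ** 3 * y + x * y]]
    b = [2 * x + x ** 3, 2 * x ** 2 * y + x ** 4 * y]
    return A, b

def general_solution(x, z1, z2):
    """General solution of (4.69) for free values z1, z2."""
    return (-2 * x ** 2 - x ** 4 + (1 + x ** 2) * z1 + (x ** 3 + x) * z2,
            2 * x + x ** 3 - x * z1 - x ** 2 * z2)

def residual(A, b, u):
    """A u - b for a two-component candidate solution u."""
    return [A[i][0] * u[0] + A[i][1] * u[1] - b[i] for i in range(2)]

x, y = F(3, 2), F(-5)                        # arbitrary sample parameter values
A, b = system(x, y)
particular = general_solution(x, F(1), x)    # the choice z = (1, x)
others = [residual(A, b, general_solution(x, F(p), F(q)))
          for p in range(-2, 3) for q in range(-2, 3)]
```

The choice z = (1, x) returns (1, x) exactly, and every other choice of the free values still satisfies the system, as expected for a rank-one coefficient matrix.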

Conclusion

This chapter focused on using the generalized inverse
of a multiparameter matrix as a technique to help solve
various optimization problems. The technique is useful in
solving systems of equations and sets of equations. For
quadratic programming types of problems, the generalized
inverse provides for an easy to understand methodology to
solve the problem. Finally, current theory regarding
simultaneous solutions of sets of matrix equations was
demonstrated for the case of n=2 matrix equations.


Conclusions and Recommendations

This thesis has examined the application of generalized
inverses of multiparameter matrices. Such matrices arise in
many modern applications, such as multi-input/multi-output
systems, highly nonlinear functions in optimization, and in
modern control theory problems. The use of these generalized
inverses provides, quite often, a powerful analysis tool.

A large part of this thesis dealt with consolidating
the vast amount of theory regarding generalized inverses.
The theorems presented in Chapter III, though not proved in
this work (except for Theorem 3.13), form the basis for the
current research into multiparameter matrix theory. The
format of the chapter was intended to provide an
understanding of the theory as well as an appreciation of
how the research has evolved into the multiparameter arena.

For the first time, the ST technique was tied to the
Fundamental Theorem of Linear Algebra. This key concept of
linear algebra was discussed at some length in Chapter III
because there is a crucial link between the concept and the
idea of a generalized inverse. The ST technique bridges
whatever gap may have existed and provides a representative
basis from each of the four fundamental subspaces associated
with any given matrix. In addition, the ST technique can
produce representatives of various generalized inverses
rather than being limited to just computing the unique
Moore-Penrose generalized inverse.


Also for the first time, an explicit proof was provided
for Theorem 3.13 (Appendix), which deals with the common
solutions of a system of matrix equations. The recursive
formulas as well as the identities used within the formulas
were proved. This theorem, an extension of work done by
previous researchers (26; 17), sets the stage for possible
future applications in parallel processing of systems of
matrix equations, with either constant coefficients or
polynomial coefficients in the matrices.

The main thrust of this research effort was to explore
the applications of the generalized inverse in non-linear
optimization involving multiparameter matrices. This was the
focus of Chapter IV in which various types of nonlinear
optimization problems were solved using the generalized
inverse technique as a basis.

There is a very strong interface between the
generalized inverse of multiparameter matrices and the
solution to a system of equations. This is easily extended
to the solution of Lagrangian optimization problems since
the partial derivatives of the Lagrangian function yield a
homogeneous system of equations.

The generalized inverse of multiparameter matrices was
shown as providing the capability of extending Nelson's
optimization algorithm to problems of higher dimension than
quadratic. Though the generalized algorithm introduced can
be combinatorially inefficient for large problems, the
algorithm is easy to implement and is sufficient for many
common optimization problems.

Finally  an  explicit  example  of  solving  sets  of  matrix 
equations  for  a  common  solution  was  shown.  This  example 
demonstrated  the  applicability  of  Theorem  3.13  and 
highlights  a  potential  area  of  future  research,  parallel 
processing  of  matrix  equations. 

There are some areas of future research that can be
undertaken. First is in the area of the ST computational
algorithm. This technique was first programmed, in FORTRAN,
for purely numerical matrices in 1985 by Murray (27). His
program can be improved upon in the sense of computer
storage requirements and in numerical accuracy.

The ST technique is applicable to finding the
generalized inverses of multiparameter matrices. The ST
technique uses elementary transformations to compute the
inverses. These transformations are easy to implement in a
computer algorithm. Expert systems such as MACSYMA enable
users to work with variable element matrices in symbolic
form. Such systems also provide some kind of capability for
developing macros (sequences of system commands or
procedures) to perform certain functions thereby increasing
the power of the system. There is already at least one macro
in MACSYMA for the generalized inverse (11), but it requires
the use of limits and for large matrices is not as efficient
as the ST technique. One very important area of research
would be to develop the ST algorithm for the MACSYMA
environment. Ideally, this program would be written in LISP
since MACSYMA is also written in the LISP language.
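
To make the idea of implementing the elementary transformations concrete, here is a minimal sketch for purely numeric matrices. It is neither the FORTRAN program nor a MACSYMA macro mentioned above: it assumes exact rational arithmetic, searches the remaining block for any non-zero pivot, and uses hypothetical names (`st_inverse`, `matmul`, `identity`). Row operations are accumulated in S and column operations in T, so that S A T is an identity block padded with zeros, and a {1,2}-inverse is then formed from the first r columns of T and the first r rows of S.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def st_inverse(A):
    """Return (G, rank), where G is a {1,2}-inverse of the numeric matrix A."""
    m, n = len(A), len(A[0])
    M = [[Fraction(v) for v in row] for row in A]   # working copy of A
    S, T = identity(m), identity(n)                  # row / column operations
    r = 0
    while r < min(m, n):
        pivot = next(((i, j) for i in range(r, m) for j in range(r, n)
                      if M[i][j] != 0), None)
        if pivot is None:
            break                                    # remaining block is zero
        pi, pj = pivot
        M[r], M[pi] = M[pi], M[r]                    # row interchange
        S[r], S[pi] = S[pi], S[r]
        for row in M + T:                            # column interchange
            row[r], row[pj] = row[pj], row[r]
        p = M[r][r]
        M[r] = [v / p for v in M[r]]                 # scale pivot row to 1
        S[r] = [v / p for v in S[r]]
        for i in range(m):                           # clear the pivot column
            f = M[i][r]
            if i != r and f != 0:
                M[i] = [a - f * c for a, c in zip(M[i], M[r])]
                S[i] = [a - f * c for a, c in zip(S[i], S[r])]
        for j in range(n):                           # clear the pivot row
            f = M[r][j]
            if j != r and f != 0:
                for row in M + T:
                    row[j] -= f * row[r]
        r += 1
    G = [[sum(T[i][k] * S[k][j] for k in range(r)) for j in range(m)]
         for i in range(n)]
    return G, r

A = [[1, 2, 3], [2, 4, 6], [1, 0, 1]]    # rank-deficient sample matrix
G, rank = st_inverse(A)
```

Exact fractions sidestep the numerical accuracy concern for small examples, at the cost of speed; a symbolic version would replace the non-zero test with a test on polynomial entries.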

Another potential area of research is in the common
solutions to sets of matrix equations, in particular the
application of this theory to a parallel processing
environment. Initially such work would be limited to the
application of constant coefficient matrices. However, the
algorithms developed for constant matrices should easily
extend to multiparameter matrices. Of course computer
capabilities must provide a parallel processing environment
for systems such as MACSYMA in order to extend any parallel
processing algorithms to multiparameter matrices.

Control theory was just briefly touched upon in this
thesis, but a new area of research is in parameterized
families of systems (i.e. multiparameter matrices) involving
the design of parameterized controllers. Sontag (43)
discusses such a problem in his 1985 tutorial article. In
particular, the aim would be to design these controllers "in the
form of a parameterized controller which regulates once its
parameters are properly tuned" (43:370). It appears this may
be a fertile area of future research.


Appendix  A:  Glossary  of  Terms 


Determinant. Defined as the sum of all signed elementary
products from a square, n x n, matrix A. An elementary product
is any product of n entries of matrix A, no two of which
shall come from the same row or column. The elementary
product is termed a "signed elementary product" when
multiplied by ±1, dependent upon whether the product is an
even or an odd permutation (1:59-70).

Diagonal Matrix. Matrix with all zero entries except along
the diagonal, which contains any non-zero entries in the
matrix. In more formal terms, the elements $a_{ij} = 0$ if $i \ne j$.
The diagonal elements are the non-zero elements $a_{ij}$ where
i = j. Also defined as an upper and lower triangular matrix.

Diagonable. If a matrix A is similar to a diagonal matrix,
then A is a diagonable matrix. See definition of similar
matrices (2:157).

Eigenvalues. The eigenvalues of a matrix A are the scalar
values, λ, for which Ax = λx has non-zero solutions
(31:264).

Eigenvectors. The non-zero solutions of Ax = λx, where the
λ scalars are the eigenvalues of the matrix A (31:264).

Elementary transformations. Very commonly referred to as
elementary row and column operations. These are operations
performed on a matrix that preserve the matrix's order and
rank. These transformations are:

(1) interchange any rows (columns)
(2) multiplication of each element in a row (column) by
a non-zero scalar

(3) the addition of a scalar multiple of one row
(column) to another row (column)

Changing the term scalar in (2) and (3) to polynomial
defines the elementary transformations on polynomial
(multiparameter) matrices (2:39,188).

Expert System. Computer system (primarily software) that is
programmed to exhibit capabilities normally attributed to
human experts in the particular specialty area. In the
current context the expert systems exhibit symbolic,
algebraic reasoning normally associated only with human
mathematical reasoning.

Field. A field is a commutative division ring. See
definition of ring (10:198).

Gradient. If the function F(x) is differentiable at a point
$x_0$, then the associated vector of partial derivatives
evaluated at $x_0$ is called the gradient vector. The gradient
vector provides information regarding the direction of
steepest ascent (descent) along a function from a particular
point. Thus, it is the key aspect of iterative optimization
techniques.

Hessian.  Matrix  of  second  partial  derivatives.  Valuable  in 
numerical  analysis  and  optimization  theory  as  the  Hessian  of 
a  function  provides  information  about  the  stationary  points 
of  a  function. 


A homogeneous set of equations is of the form:

Ax = 0

Thus every solution to the system is a member of the
nullspace of the system. If there are more unknowns in the
system than there are equations, then the system has
non-trivial solutions. This means there is a solution to the
system other than simply x = 0 (46:58).

Identity  matrix.  Square,  diagonal  matrix  with  all  diagonal 
elements  equal  to  1.  Also  the  multiplicative  identity  of  the 
algebraic  field  of  matrices. 

Inner product (dot product). Sum of the element-wise product
of two vectors. For example,

\[ X \cdot Y = \sum_i (x_i y_i) \]

Lagrange multipliers. Optimization method involving the
derivatives of the objective function and the equality
constraints of the problem. A maximum or minimum point
occurs when the derivative of the objective function equals
zero. The constraint derivative already equals zero (since
the derivative of a constant right hand side is zero).
Rather than solve for each variable in the constraint and
back solve the system of equations, the constraint is
multiplied by an undetermined value, λ, and added to the
objective function. The partial derivatives of the resulting
equation produce n equations in n unknowns which can be
solved uniquely. Usually, the system is solved for the λ
values and the optimal values can then be determined
(3:174).
Least-squares. Method of estimating parameters of an
equation where the function minimized is the squared
difference of the actual data value and the value predicted
by the current set of estimated parameters.

Markov chains. The most common model for which the input
and output relationship is random is the Markov process.

When this process is discussed with respect to discrete
time and discrete range, the process is termed a Markov chain.
The Markov chain, with n states in the state space, is
typically characterized by an n x n matrix of probabilities of
transition from state to state (35:992).
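
A short sketch of how such a transition matrix is used follows. The two-state matrix, the starting distribution, and the helper name `step` are hypothetical and serve only to illustrate the definition.

```python
def step(distribution, transition):
    """Advance a state distribution one time step.

    transition[i][j] is the probability of moving from state i to state j,
    so each row of the matrix sums to one.
    """
    n = len(transition)
    return [
        sum(distribution[i] * transition[i][j] for i in range(n))
        for j in range(n)
    ]


# Hypothetical 2-state chain.
P = [
    [0.9, 0.1],
    [0.5, 0.5],
]
start = [1.0, 0.0]          # begin in state 0 with certainty
after_one = step(start, P)
after_two = step(after_one, P)
```

Each call to `step` multiplies the row vector of state probabilities by the transition matrix, so the distribution after two steps is the starting distribution times the square of the matrix.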

Minimum-norm. The minimal value of the function $\|Ax - b\|$,
where x is a vector of estimated parameters.

Minimum-variance. A minimum-variance estimator of a
parameter is a random variable with the property of having
the smallest variance among all estimators of that
parameter (25:15).

Maximum Likelihood Estimation (MLE). A parameter estimation
technique that maximizes the probability (or likelihood) of
the observed sample (23:372).

Norm of a vector. Also referred to as the length of a
vector. Denoted $\|x\|$, the norm is defined as the square
root of the inner product of a vector with itself. For
example: $\|x\| = \sqrt{x \cdot x}$ (1:95).
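
The inner product and norm definitions can be sketched in a few lines. The function names and the sample vectors are illustrative assumptions.

```python
import math


def inner_product(x, y):
    """Sum of the element-wise products of two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    return sum(a * b for a, b in zip(x, y))


def norm(x):
    """Length of a vector: square root of its inner product with itself."""
    return math.sqrt(inner_product(x, x))


# Hypothetical vectors chosen so the arithmetic is easy to check by hand.
x = [3.0, 4.0]
y = [-4.0, 3.0]
length_x = norm(x)              # sqrt(9 + 16)
dot_xy = inner_product(x, y)    # -12 + 12
```

Here the two sample vectors have an inner product of zero, so they are perpendicular to each other.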


Null space. The set of solutions to Ax = 0 forms a vector
space called the null space. In more abstract terms, the
null space is the set of points that the transformation
matrix, A, maps into the point zero. The null space is the
kernel of the transformation given by the matrix A. The
dimension of the null space is defined as the nullity.

Orthogonal. Two vectors are said to be orthogonal if their
inner product (dot product) is equal to zero. Orthogonal
vectors intersect at 90° angles (i.e., they are
perpendicular). A square matrix is orthogonal if $A^{-1} = A^T$, and
the equality $A^T A = A A^T = I$ holds (2:103).

Orthonormal. In general terms, if the vectors in a set are
mutually orthogonal, and each has a norm of one, the set is
an orthonormal set of vectors (2:105).

Parallel  processing.  Computer  processing  of  more  than  one 
task  on  the  same  computer,  simultaneously. 

Positive definite matrix. A matrix, A, is positive definite
if its quadratic form satisfies $x^T A x > 0$ for all $x \neq 0$ (38:768).

Rank of a matrix. The rank of a matrix A is the dimension
of the row and column space of A. The dimension of the row
and column space is defined as the number of linearly
independent vectors required to span the space (1:157).

Range. The range of a matrix A, or the transformation
induced by A, is defined as all possible values of Ax.

Ring. A ring is a set with the binary operations of
addition and multiplication. The addition operation is
commutative with additive identity 0. If the multiplication
operation is commutative, the ring is termed a commutative
ring. If the ring contains a multiplicative identity, such
as 1, and each nonzero element has a unique multiplicative
inverse in the ring, the ring is called a division ring (10:195).

Set, Subset. A set is a well-defined collection of objects.
A subset is itself a set, but also entirely contained within
some other set of equal or larger size (10:2).

Singular matrices. A square matrix, A, that does not
possess an inverse matrix, $A^{-1}$. In these cases, the
determinant of A, det(A), is zero. All non-square matrices
are singular matrices, with an undefined determinant.

Slack variable. A slack, or surplus, variable represents
the positive difference between the left- and right-hand sides
of an inequality. These variables are used in
linear programming (LP) to transform inequality constraints
into equality constraints. For example, adding the slack
variable $S_1$ allows:

\[ 2x_1 + x_2 - 3x_3 \le 25 \;\longrightarrow\; 2x_1 + x_2 - 3x_3 + S_1 = 25 \]
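
The transformation can be sketched numerically for the constraint in the example above. The helper name `slack_value` and the trial point are hypothetical.

```python
def slack_value(coefficients, x, rhs):
    """Slack needed to turn sum(c_i * x_i) <= rhs into an equality at x."""
    lhs = sum(c * xi for c, xi in zip(coefficients, x))
    slack = rhs - lhs
    if slack < 0:
        raise ValueError("x violates the inequality constraint")
    return slack


# Constraint from the example: 2*x1 + x2 - 3*x3 <= 25.
coefficients = [2.0, 1.0, -3.0]
rhs = 25.0
x = [4.0, 5.0, 1.0]                      # hypothetical feasible point
s1 = slack_value(coefficients, x, rhs)
lhs_with_slack = sum(c * xi for c, xi in zip(coefficients, x)) + s1
```

At any feasible point the slack is non-negative, and adding it to the left-hand side reproduces the right-hand side exactly.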

Smith form. Diagonal form of a multiparameter matrix, $A(\lambda)$,
where the diagonal polynomial elements, $f_i(\lambda)$, are monic and
$f_i(\lambda)$ divides $f_{i+1}(\lambda)$ for every $i$. These polynomial
elements are the invariant factors of $A(\lambda)$. If each $f_i(\lambda) = 1$,
giving the identity matrix, each $f_i(\lambda)$ is called a trivial
invariant factor (2:188).

Stationary point. Point of the surface of the function
where the function is neither increasing nor decreasing,
within a sufficiently small region about the point.

Similar matrices. Two n x n square matrices, A and B, are
similar if there exists some non-singular matrix, P, such
that $B = P^{-1} A P$. The matrices A and B are said to be
equivalent matrices. This definition also holds if the
matrices A and B are defined over the ring of polynomials
(2:156).

Triangular matrix. The two triangular matrices are upper
and lower triangular matrices. In an upper triangular
matrix, each $a_{ij} = 0$ if $i > j$. Conversely, a lower triangular
matrix requires each $a_{ij} = 0$ if $i < j$. If a matrix is both
upper and lower triangular, the matrix is a diagonal matrix.

Unbiased estimator. For an estimator of a parameter to be
unbiased, the long-run average, or expected value, of the
estimator should be the parameter being estimated (25:14).

Appendix B: Proof of Theorem 3.13


Introduction 

This  theorem  has  a  couple  of  important  uses.  First,  it 
provides  necessary  and  sufficient  conditions  for  the 
existence  of  a  common  solution  to  a  system  of  matrix 
equations.  Secondly,  it  provides  a  recursive  algorithm  for 
computing  the  common  solution  to  the  system.  The  work  in 
this  appendix  verifies,  by  mathematical  induction,  the 
necessary  and  sufficient  conditions  and  the  recursive 
algorithm  supplied  by  the  proof. 

All matrices are assumed defined over the complex ring
of polynomials. Further, to avoid confusion regarding
the subscript notation, a generalized inverse is denoted as
$A^{-}$. Subscripts denote matrix numbering only.

The  method  of  proof  is  to  first  verify  the  necessary 
and  sufficient  conditions  for  the  system  of  equations.  The 
next  step  is  to  demonstrate  the  validity  of  the  general 
solution  expression. 

Theorem  3.13 

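Let $A_i$ and $B_i$, for $i = 1, \ldots, m$, be complex matrices of
conformable dimensions (the proof below takes each $A_i$ to be
$p \times q$). Define the following recursive relationships:

\[
C_1 = A_1, \quad D_1 = B_1, \quad E_1 = A_1^{-} B_1, \quad F_1 = I - A_1^{-} A_1 \tag{B.1}
\]

and, for $k = 2, \ldots, m$,

\[
\begin{aligned}
C_k &= A_k F_{k-1} \\
D_k &= B_k - A_k E_{k-1} \\
E_k &= E_{k-1} + F_{k-1} C_k^{-} D_k \\
F_k &= F_{k-1} (I - C_k^{-} C_k)
\end{aligned}
\]

Then $A_i x = B_i$, for $i = 1, \ldots, m$, has a common solution if
and only if $C_i C_i^{-} D_i = D_i$ for $i = 1, \ldots, m$. In this case the
general common solution is given by

\[
x = E_m + F_m z \tag{B.2}
\]

where $z$ is arbitrary.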
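
The recursion in the theorem can be sketched numerically. This is an illustration under stated assumptions rather than the author's implementation: it works with ordinary real-valued NumPy arrays instead of matrices over a polynomial ring, it uses the Moore-Penrose inverse from `numpy.linalg.pinv` as one admissible choice of generalized inverse, and the function name `common_solution`, the tolerance, and the sample systems are hypothetical.

```python
import numpy as np


def common_solution(As, Bs, tol=1e-9):
    """Recursion of Theorem 3.13 for the systems A_i x = B_i.

    Returns (E, F) so that x = E + F @ z solves every system for any z,
    or None when some consistency check C_i C_i^- D_i = D_i fails.
    """
    q = As[0].shape[1]
    # Assumption: starting from E_0 = 0 and F_0 = I makes the first pass
    # produce C_1 = A_1, D_1 = B_1, E_1 = A_1^- B_1, F_1 = I - A_1^- A_1.
    E = np.zeros((q,) + Bs[0].shape[1:])
    F = np.eye(q)
    for A, B in zip(As, Bs):
        C = A @ F
        D = B - A @ E
        C_ginv = np.linalg.pinv(C)  # one valid choice of generalized inverse
        if not np.allclose(C @ C_ginv @ D, D, atol=tol):
            return None
        E = E + F @ C_ginv @ D
        F = F @ (np.eye(q) - C_ginv @ C)
    return E, F


# Hypothetical example: x1 = 1 and x2 = 2, with x3 left free.
As = [np.array([[1.0, 0.0, 0.0]]), np.array([[0.0, 1.0, 0.0]])]
Bs = [np.array([1.0]), np.array([2.0])]
E, F = common_solution(As, Bs)

z = np.array([7.0, -3.0, 2.0])   # arbitrary vector
x = E + F @ z

# Adding the contradictory equation x1 = 5 should fail the consistency check.
A3, B3 = np.array([[1.0, 0.0, 0.0]]), np.array([5.0])
inconsistent = common_solution(As + [A3], Bs + [B3])
```

Processing the equations one at a time mirrors the induction in the proof: each pass restricts the free part of the previous general solution so that the next equation is also satisfied.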

Proof

Case n=1.

Consider the following system of equations:

\[ A_1 x = B_1 \tag{B.3} \]

From Corollary 3.2.1, the system of equations given by (B.3)
has a solution if and only if the following consistency
condition is true:

\[ A_1 A_1^{-} B_1 = B_1 \tag{B.4} \]

The solution then is given by the equation:

\[ x = A_1^{-} B_1 + (I - A_1^{-} A_1) z \qquad z \text{ arbitrary} \tag{B.5} \]

Using the recursive definitions provided in the
theorem, equation (B.1), and directly substituting these
into equation (B.4), an equivalent form becomes:

\[ C_1 C_1^{-} D_1 = D_1 \tag{B.6} \]

and the general solution, (B.5), is equivalent to:

\[ x = E_1 + F_1 z \qquad z \text{ arbitrary} \tag{B.7} \]

Case n=2.


Consider the following system of equations:

\[
\begin{aligned}
A_1 x &= B_1 \\
A_2 x &= B_2
\end{aligned} \tag{B.8}
\]

The necessary and sufficient conditions are that
$C_2 C_2^{-} D_2 = D_2$. This may be expanded out using the definitions as
in the following:

\[
C_2 C_2^{-} D_2 = D_2 \;\Longrightarrow\; \left[A_2 (I - A_1^{-} A_1)\right] \left[A_2 (I - A_1^{-} A_1)\right]^{-} D_2 = D_2 \tag{B.9}
\]

Consider the expression $A_2 (I - A_1^{-} A_1)$. Suppose that $A_1$
and $A_2$ are matrices defined over $\mathbb{C}^{p \times q}$; then $(I - A_1^{-} A_1)$ is $q \times q$,
and the entire expression is $p \times q$. Since the expression
$P = \left[A_2 (I - A_1^{-} A_1)\right]$ is a matrix, and an element of $\mathbb{C}^{p \times q}$,
there exists a generalized inverse, $P^{-}$, such that $P P^{-} P = P$.
Thus the left side of equation (B.9) reduces to simply $D_2$. This verifies the
necessary and sufficient conditions.

The general solution is given by:

\[ x = E_2 + F_2 z \qquad z \text{ arbitrary} \tag{B.10} \]

Again using the recursive definitions, (B.10) can be
expanded out to the following form:

\[
\begin{aligned}
x ={}& A_1^{-} B_1 + (I - A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(B_2 - A_2 A_1^{-} B_1) \\
&+ (I - A_1^{-} A_1)\left[I - (A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right] z
\end{aligned} \tag{B.11}
\]

Now that the formula is in terms of the matrices given
in (B.8), the solution is easily verified. Premultiply by $A_1$
and (B.11) becomes:

\[
\begin{aligned}
A_1 x ={}& A_1 A_1^{-} B_1 + (A_1 - A_1 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(B_2 - A_2 A_1^{-} B_1) \\
&+ (A_1 - A_1 A_1^{-} A_1)\left[I - (A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right] z
\end{aligned} \tag{B.12}
\]

Since $(A_1 - A_1 A_1^{-} A_1) = (A_1 - A_1) = 0$, (B.12) reduces to:

\[ A_1 x = A_1 A_1^{-} B_1 = B_1 \tag{B.13} \]


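Premultiply (B.11) by $A_2$ and the result is:

\[
\begin{aligned}
A_2 x ={}& A_2 A_1^{-} B_1 + (A_2 - A_2 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(B_2 - A_2 A_1^{-} B_1) \\
&+ (A_2 - A_2 A_1^{-} A_1)\left[I - (A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right] z
\end{aligned} \tag{B.14}
\]

The homogeneous portion of (B.14) falls out of the
equation, as demonstrated:

\[
\begin{aligned}
&(A_2 - A_2 A_1^{-} A_1)\left[I - (A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right] \\
&\quad = (A_2 - A_2 A_1^{-} A_1) - (A_2 - A_2 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1) \\
&\quad = (A_2 - A_2 A_1^{-} A_1) - (A_2 - A_2 A_1^{-} A_1) = 0
\end{aligned} \tag{B.15}
\]

Furthermore, by the consistency condition (B.9),
$(A_2 - A_2 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(B_2 - A_2 A_1^{-} B_1)$ is
equal to $(B_2 - A_2 A_1^{-} B_1)$. Thus (B.14) becomes:

\[ A_2 x = A_2 A_1^{-} B_1 + B_2 - A_2 A_1^{-} B_1 = B_2 \tag{B.16} \]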


Case  n=3. 


Consider the following system of equations:

\[
\begin{aligned}
A_1 x &= B_1 \\
A_2 x &= B_2 \\
A_3 x &= B_3
\end{aligned} \tag{B.17}
\]

The necessary and sufficient conditions are that
$C_3 C_3^{-} D_3 = D_3$. This may be expanded out using the definitions as
in the following:

\[
\begin{aligned}
&\left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right] \\
&\quad \cdot \left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right]^{-} D_3 = D_3
\end{aligned} \tag{B.18}
\]

where the $D_3$ term has not been expanded. Just as in the n=2
case considered above, each term in parentheses is
considered. It is already established that $P = (A_2 - A_2 A_1^{-} A_1)$ is
$p \times q$, provided each of the $A_i$ matrices is $p \times q$. In a similar
manner, the $R = (A_3 - A_3 A_1^{-} A_1)$ expression can be shown to be a
$p \times q$ matrix. Thus (B.18) may be rewritten as:

\[ (R - R P^{-} P)(R - R P^{-} P)^{-} D_3 = D_3 \tag{B.19} \]

The expression (B.19) is equivalent to (B.18), just
easier to read. From (B.19), it is easy to see that the
matrix defined by $(R - R P^{-} P)$ and its generalized inverse
$(R - R P^{-} P)^{-}$ cause the following to hold true:

\[ C_3 C_3^{-} D_3 = (R - R P^{-} P)(R - R P^{-} P)^{-} D_3 = D_3 \tag{B.20} \]

This then verifies the necessary and sufficient
conditions for the case n=3. The general solution is then
given by the expression $x = E_3 + F_3 z$, $z$ arbitrary, which,
when expanded out using the recursive definitions, is
equivalent to the following:

\[
\begin{aligned}
x ={}& A_1^{-} B_1 + (I - A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(B_2 - A_2 A_1^{-} B_1) \\
&+ (I - A_1^{-} A_1)\left[I - (A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right] \\
&\quad \cdot \left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right]^{-} \\
&\quad \cdot \left[(B_3 - A_3 A_1^{-} B_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(B_2 - A_2 A_1^{-} B_1)\right] \\
&+ (I - A_1^{-} A_1)\left[I - (A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right] \\
&\quad \cdot \Bigl[I - \left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right]^{-} \\
&\qquad \cdot \left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right]\Bigr] z
\end{aligned} \tag{B.21}
\]

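This expression is verified by premultiplying the
expression by $A_1$, $A_2$, and $A_3$. First the $A_1$ solution is
verified:

\[
\begin{aligned}
A_1 x ={}& A_1 A_1^{-} B_1 + (A_1 - A_1 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(B_2 - A_2 A_1^{-} B_1) \\
&+ (A_1 - A_1 A_1^{-} A_1)\left[I - (A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right] \\
&\quad \cdot \left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right]^{-} \\
&\quad \cdot \left[(B_3 - A_3 A_1^{-} B_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(B_2 - A_2 A_1^{-} B_1)\right] \\
&+ (A_1 - A_1 A_1^{-} A_1)\left[I - (A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right] \\
&\quad \cdot \Bigl[I - \left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right]^{-} \\
&\qquad \cdot \left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right]\Bigr] z
\end{aligned} \tag{B.22}
\]

Since the expression $(A_1 - A_1 A_1^{-} A_1) = A_1 - A_1 = 0$, the
entire expression reduces to:

\[ A_1 x = A_1 A_1^{-} B_1 = B_1 \tag{B.23} \]

Premultiply (B.21) by $A_2$ to obtain: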


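\[
\begin{aligned}
A_2 x ={}& A_2 A_1^{-} B_1 + (A_2 - A_2 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(B_2 - A_2 A_1^{-} B_1) \\
&+ (A_2 - A_2 A_1^{-} A_1)\left[I - (A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right] \\
&\quad \cdot \left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right]^{-} \\
&\quad \cdot \left[(B_3 - A_3 A_1^{-} B_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(B_2 - A_2 A_1^{-} B_1)\right] \\
&+ (A_2 - A_2 A_1^{-} A_1)\left[I - (A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right] \\
&\quad \cdot \Bigl[I - \left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right]^{-} \\
&\qquad \cdot \left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right]\Bigr] z
\end{aligned} \tag{B.24}
\]

Next consider the leading factor
$(A_2 - A_2 A_1^{-} A_1)\left[I - (A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right]$
shared by the third and fourth terms of (B.24). It simplifies to:

\[
\begin{aligned}
&(A_2 - A_2 A_1^{-} A_1) - (A_2 - A_2 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1) \\
&\quad = (A_2 - A_2 A_1^{-} A_1) - (A_2 - A_2 A_1^{-} A_1) = 0
\end{aligned} \tag{B.25}
\]

This causes the third and fourth terms of (B.24) to vanish and, since the
second term equals $(B_2 - A_2 A_1^{-} B_1)$ as in the n=2 case, the entire
expression simplifies to just:

\[ A_2 x = A_2 A_1^{-} B_1 + (B_2 - A_2 A_1^{-} B_1) = B_2 \tag{B.26} \]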

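Premultiply (B.21) by $A_3$ to obtain:

\[
\begin{aligned}
A_3 x ={}& A_3 A_1^{-} B_1 + (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(B_2 - A_2 A_1^{-} B_1) \\
&+ (A_3 - A_3 A_1^{-} A_1)\left[I - (A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right] \\
&\quad \cdot \left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right]^{-} \\
&\quad \cdot \left[(B_3 - A_3 A_1^{-} B_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(B_2 - A_2 A_1^{-} B_1)\right] \\
&+ (A_3 - A_3 A_1^{-} A_1)\left[I - (A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right] \\
&\quad \cdot \Bigl[I - \left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right]^{-} \\
&\qquad \cdot \left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right]\Bigr] z
\end{aligned} \tag{B.27}
\]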

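Consider the third term of (B.27), the product of the three
bracketed factors. This multiplies out to:

\[
\begin{aligned}
&\left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right] \\
&\quad \cdot \left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right]^{-} \\
&\quad \cdot \left[(B_3 - A_3 A_1^{-} B_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(B_2 - A_2 A_1^{-} B_1)\right] \\
&= (B_3 - A_3 A_1^{-} B_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(B_2 - A_2 A_1^{-} B_1)
\end{aligned} \tag{B.28}
\]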

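The coefficient of $z$ in the fourth term of (B.27)
reduces to the following:

\[
\begin{aligned}
&\left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right] \\
&\quad - \left[(A_3 - A_3 A_1^{-} A_1) - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(A_2 - A_2 A_1^{-} A_1)\right] = 0
\end{aligned} \tag{B.29}
\]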

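Using (B.28) and (B.29), the expression for the general
solution, (B.27), reduces to:

\[
\begin{aligned}
A_3 x ={}& A_3 A_1^{-} B_1 + (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(B_2 - A_2 A_1^{-} B_1) \\
&+ B_3 - A_3 A_1^{-} B_1 - (A_3 - A_3 A_1^{-} A_1)(A_2 - A_2 A_1^{-} A_1)^{-}(B_2 - A_2 A_1^{-} B_1) = B_3
\end{aligned} \tag{B.30}
\]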


Suppose, for any k, $A_i x = B_i$ for $i = 1, \ldots, k$ has a common
solution if and only if $C_i C_i^{-} D_i = D_i$ for $i = 1, \ldots, k$. And
suppose this solution is given by:

\[ x = E_k + F_k z \qquad z \text{ arbitrary} \tag{B.31} \]

Case n=k+1.

Consider the following system of equations:

\[
\begin{aligned}
A_1 x &= B_1 \\
&\;\;\vdots \\
A_{k+1} x &= B_{k+1}
\end{aligned} \tag{B.32}
\]

If the common solution to $A_{k+1} x = B_{k+1}$ is the same as the
common solution to the set of the first k equations, then:

\[
A_{k+1} (E_k + F_k z) = B_{k+1} \;\Longrightarrow\; A_{k+1} F_k z = B_{k+1} - A_{k+1} E_k \tag{B.33}
\]

which, from the definitions, is equivalent to

\[ C_{k+1} z = D_{k+1} \tag{B.34} \]

For (B.34) to have a solution, Corollary 3.2.1 requires
that $C_{k+1} C_{k+1}^{-} D_{k+1} = D_{k+1}$. In this case, the general solution
is:

\[ z = C_{k+1}^{-} D_{k+1} + (I - C_{k+1}^{-} C_{k+1}) z \tag{B.35} \]

where the $z$ on the right-hand side is arbitrary.

The expression in (B.35) can be used along with (B.31)
to conclude that for (B.32) to have a common solution,

\[ C_{k+1} C_{k+1}^{-} D_{k+1} = D_{k+1} \tag{B.36} \]

and the general solution is:

\[
\begin{aligned}
x &= E_k + F_k \left[C_{k+1}^{-} D_{k+1} + (I - C_{k+1}^{-} C_{k+1}) z\right] \\
&= E_k + F_k C_{k+1}^{-} D_{k+1} + F_k (I - C_{k+1}^{-} C_{k+1}) z
\end{aligned} \tag{B.37}
\]


Using the definitions for the $E_k$ and $F_k$ terms defined in
the theorem, and updating for the current situation:

\[ x = E_{k+1} + F_{k+1} z \tag{B.38} \]

Thus the result holds for n = k+1 and, by induction, is
true for all k. This completes the proof.